Dedup + URL extraction
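
The title does not say what is deduplicated or how URLs are matched, so the following is only a minimal sketch of one way the two steps could fit together: pull URLs out of free text with a regular expression, then drop repeats while keeping first-seen order. The pattern `URL_RE` and the function `extract_unique_urls` are hypothetical names chosen for illustration, not taken from the source.

```python
import re

# Assumed pattern (not from the source): http(s) URLs, ending at whitespace
# or at a common closing delimiter such as a quote, bracket, or parenthesis.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_unique_urls(text: str) -> list[str]:
    """Return the URLs found in text, in first-seen order, without duplicates."""
    seen = set()
    unique = []
    for match in URL_RE.finditer(text):
        # Trim trailing sentence punctuation that the pattern may have captured.
        url = match.group(0).rstrip(".,;:!?")
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


if __name__ == "__main__":
    sample = "See https://example.com/a, then http://example.org/b and https://example.com/a."
    print(extract_unique_urls(sample))
```

Deduplication here is exact string matching. Treating URLs that differ only in case, trailing slash, or query-parameter order as the same would need a normalization step before the `seen` check.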