Data Cleaning - Deduplication

Parallel computation model

Model equation
Reducer arrangement
Single block distribution

Record fusion and entity consolidation

Probabilistic method
U-clean

Source accuracy

Accuracy of a source
