I partook in a small but interesting discussion a while ago about how bad local data is out there. Not just bad, but also impossible to clean up.
I’ve been taking the time lately to go back through iBegin and ‘scrub’ our data. As it happens, the raw data we purchase is far from perfect (duplicates galore, mis-categorizations, etc.). It is essentially a ‘risk’ we take. But that isn’t the end of it - even franchises suffer from big problems when it comes to local data.
Case in point: McDonald’s. You cannot get a more recognizable name. But do note how it is spelled - McDonald + ’ + s. Not McDonalds, not Mc Donalds, not MacDonalds, or any of the dozens of other variations.
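For a rough idea of what a scrubbing rule for this could look like, here is a minimal Python sketch. The rule list, `scrub_name`, and `run_pass` are all made up for illustration (they are not iBegin’s actual rules), and the regex only covers a handful of the misspellings, so treat it as a starting point rather than a finished cleaner.

```python
import re

# Hypothetical rules: (pattern, canonical form). Illustrative only,
# not the rules actually used on any real data set.
RULES = [
    # Catches McDonalds, Mc Donalds, MacDonalds, Mc Donald’s, MCDONALDS, ...
    (re.compile(r"\bma?c\s*donald['’`]?s\b", re.IGNORECASE), "McDonald's"),
]


def scrub_name(name, rules=RULES):
    """Collapse stray whitespace, then apply every rule in order."""
    cleaned = " ".join(name.split())
    for pattern, canonical in rules:
        cleaned = pattern.sub(canonical, cleaned)
    return cleaned


def run_pass(names, rules=RULES):
    """Run one pass over the data; return cleaned names and how many changed."""
    cleaned = [scrub_name(n, rules) for n in names]
    changed = sum(1 for before, after in zip(names, cleaned) if before != after)
    return cleaned, changed


if __name__ == "__main__":
    sample = ["McDonalds", "Mc Donalds", "MacDonalds", "McDonald's", "Mc Donald’s"]
    cleaned, changed = run_pass(sample)
    print(cleaned, changed)
```

A pattern this loose would also rewrite an unrelated ‘MacDonalds Plumbing’, which is exactly the kind of thing you only catch by checking the output, tightening the rule, and running it again.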
So while we went through our data pass by pass (basically you create rules, ‘run’ the rules on the data, tweak the rules, and then re-run), I wondered what my esteemed competition was up to.
Looks like not much. Checking them out:
Really my point here (amidst the connections in my brain) is that if companies cannot even get the data on the largest franchise in the world right, how are they going to cover data on small businesses?
It’s a mind-boggling problem.
One Response to: Scrubbing Local Data
Yahoo Local: I’m Powerful. No Wait, I’m Confusing. - Tech Soapbox (ghost)
February 19th, 2007 at 7:12 pm
[...] When it comes to the US, Yahoo! Local is by far the best site. As I outlined in my previous post about scrubbing local data, they have taken extra steps to make sure their data is accurate and clean. They have a ton of data and information - from local reviews to web-results to even extra information gleaned from sources like Delicious. [...]