There are a few headaches with local. While my current headache is as ambiguous as you can get, the persistent headache has always been search.
Search isn’t fun. Providing relevance for what a user is looking for is a painful task really - you take what a user searches, try to build relationships between what they put in to whatever categorization you have, factor in other variables (user rating, user favorites, distance from end user, etc), and try to throw out the most relevant results.
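To make those moving parts concrete, here is a minimal Python sketch of blending text match, rating, favorites, and distance into one relevance score. It is not the iBegin engine: the `Listing` fields, the weights, and the scoring formula are all assumptions made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical record layout, invented for this example.
    name: str
    categories: tuple
    rating: float      # average user rating, 0 to 5
    favorites: int     # number of users who favorited the listing
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def score(listing, query_terms, user_lat, user_lon):
    """Blend text match, rating, favorites, and distance into one number.

    The weights (3.0, 0.5, 2.0, ...) are arbitrary assumptions.
    """
    words = set(listing.name.lower().split()) | {c.lower() for c in listing.categories}
    text = len(words & query_terms) / len(query_terms) if query_terms else 0.0
    if text == 0:
        return 0.0  # no textual relationship at all: drop the listing
    dist = haversine_km(user_lat, user_lon, listing.lat, listing.lon)
    proximity = 1.0 / (1.0 + dist)                    # closer is better
    popularity = math.log1p(listing.favorites) / 10.0  # dampen large counts
    return 3.0 * text + 0.5 * (listing.rating / 5.0) + popularity + 2.0 * proximity


def search(listings, query, user_lat, user_lon, limit=10):
    """Return the best-scoring listings for a query, highest score first."""
    terms = set(query.lower().split())
    scored = [(score(l, terms, user_lat, user_lon), l) for l in listings]
    scored = [(s, l) for s, l in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [l for _, l in scored[:limit]]
```

Note that the score depends on where the user is standing, so the same query produces a different ranking for every location.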
Not only that, but you have to scale it.
Simple search solutions break down once you hit 100,000+ unique records. More advanced search solutions start to groan under 1,000,000+ records. And at 10,000,000+ records, you have to be smart about it. The ‘where’ isn’t a simple text matching issue (which MySQL can handle for you) - there are multiple variables that are unique in each search. So pre-caching popular searches is also a dead-end solution.
Then you want to add in extra features. Metaphones and stemming, while relatively trivial (in basic implementation), become a bit of a headache when dealing with a large number of records.
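As a rough illustration of the basic implementation, here is a Python sketch that indexes every word under a stemmed key and a phonetic key. Two loud assumptions: the stemmer is a toy suffix stripper (not Porter or any real stemmer), and the phonetic key is classic Soundex, swapped in for Metaphone because it fits in a few lines. The record data and function names are hypothetical.

```python
from collections import defaultdict

# Soundex digit for each consonant; vowels, h, w, and y have no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Classic four-character Soundex code (a stand-in for Metaphone)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    out = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (out + "000")[:4]


def stem(word):
    """Toy stemmer: strip one common suffix. An assumption, not Porter."""
    word = word.lower()
    for suffix in ("ing", "ers", "er", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(records):
    """Map stem and phonetic keys to the ids of records containing them."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.split():
            index["stem:" + stem(word)].add(rec_id)
            index["snd:" + soundex(word)].add(rec_id)
    return index


def lookup(index, query):
    """Return ids of records matching any query word by stem or sound."""
    hits = set()
    for word in query.split():
        hits |= index.get("stem:" + stem(word), set())
        hits |= index.get("snd:" + soundex(word), set())
    return hits
```

With this, a search for "plumbing" finds "Acme Plumbers" through the shared stem, and "MacDonalds" finds "McDonalds" through the shared Soundex code. The catch at scale is visible in `build_index`: every word is stored under two extra keys, so the index grows and takes longer to rebuild as the record count climbs.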
When it is all said and done, generating our search engine (one time) takes about 72-96 hours. This is the new version - the current one on iBegin Source is quite slow - the new one should clock in at roughly 5x-10x faster. And it should finally be live sometime next week.
I’ve always wondered what our competition uses.
A while ago I accidentally stumbled upon Judy’s Book’s search partner - Launch 21. The blog is quite detailed about the process behind Coupon Looker - interesting stuff.
So today I was intrigued when I saw Yelp’s Jobs page (with the title ‘About us’ for some reason). The first job posting was for a Senior Software Engineer - and the word that caught my eye was Lucene.
I’ve spent a lot of time researching and understanding larger search systems, like Lucene, Xapian, Sphinx Search, and others. All are quite nice (and pack power), but molding an existing search system for local search is neither an easy nor a fun task.
I’m curious what other companies are doing (too many to name) - any clue? Are they custom (like us), outsourced/custom (like JB’s Coupon Looker), or an existing system modified (like Yelp)?
I read this first on The Local Onliner: BackFence, one of the more visible hyperlocal sites, has bitten the dust.
BackFence almost feels like the old grandpa of the entire hyperlocal market - there are now half a dozen other sites all trying to crack hyperlocal (including a few of our customers of iBegin Source). I find it remarkable that even with all the hyperlocal sites dying, new ones keep coming up.
Mind you - I am a big believer in hyperlocal. Our first foray into local was hyperlocal - we built a local community website for the little block we lived in. My fiancee walked around and took pictures of every place in the neighborhood, manually entering it into the database.
The uptake was amazing. Within two months the site was doing roughly 200-300 unique visitors a day. For a two block area, and with minimal promotion, it was quite an eye opener - it let us know that people are (in large quantities) attempting to find and connect locally.
Alas the site no longer remains. It was on one of our older servers, which suffered a catastrophic failure. The entire site was lost.
The most popular specific searches across our local properties are always brand names - Sears, [local grocery chain], McDonalds, etc etc.
What has always boggled my mind is that if you visit a franchise/national brand store (eg Sears or McDonalds), they all have a franchise locator. It is an obvious feature that people would like.
But what absolutely boggles my mind is why this data is locked up. If I were McDonalds, I would want to make sure everyone knew where you could find McDonalds. I would want to make sure all closed down stores were not listed. I would make this data available for free.
The reality is, no matter how much they may want to crawl back into their cocoon, people use other sites to find their brands. Even niche players get a significant enough chunk of traffic. Sure, McDonalds may talk directly to SuperPages.com. And YellowPages.com. You can argue that it isn’t profitable enough for them to have a direct relationship with everyone - fine. But is it really in their interests to allow other companies to publish bad data? Of course not.
If I could sit down with the person responsible for their internet operations, I would just have one question to ask: “Why don’t you allow anyone and everyone to download a list of all [insert name here] locations? All these people are doing is *promoting* your business.”
There is a lot of talk about walled gardens of data, and how web 2.0 is supposed to change that. There are some legitimate reasons for walled gardens, but for franchise data? None.
And I am happy to announce the launch of our geocoding service - iBegin Geocoder.
We now feature:
Whew!
The first of my previously mentioned three releases, this is a soft launch of our geocoding service: iBegin Geocoder.
Basically enter any address in the US/Canada, and it can convert it to GPS coordinates. Enter any lat/long in the US/Canada, and it can give you the address it translates to, the nearest intersection, and the nearest major intersection.
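For a feel of what the reverse direction involves, here is a minimal Python sketch of the "nearest intersection" and "nearest major intersection" lookups. It is not the iBegin Geocoder: the intersection table, its coordinates, and the function names are all made up, and a brute-force scan like this would not hold up against a real street dataset.

```python
import math

# Hypothetical sample data: (name, lat, lon, is_major). Not real coordinates.
INTERSECTIONS = [
    ("1st St & Elm Ave", 40.0000, -75.0000, False),
    ("Main St & Elm Ave", 40.0020, -75.0000, True),
    ("Main St & Oak Ave", 40.0100, -75.0100, True),
]


def approx_distance_km(lat1, lon1, lat2, lon2):
    """Equirectangular approximation; adequate for points a few km apart."""
    r = 6371.0  # mean Earth radius in km
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return r * math.hypot(dx, dy)


def nearest_intersection(lat, lon, major_only=False):
    """Return the name of the closest intersection, or None if there is none."""
    candidates = [i for i in INTERSECTIONS if i[3] or not major_only]
    if not candidates:
        return None
    best = min(candidates, key=lambda i: approx_distance_km(lat, lon, i[1], i[2]))
    return best[0]
```

Calling `nearest_intersection(lat, lon)` answers the first question and `nearest_intersection(lat, lon, major_only=True)` the second. A production system would presumably use a spatial index rather than scanning every row.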
Any application that works with user locality in US/Canada needs geocoding.
It should be fully complete and ready for commercial usage by Wednesday.
In the meantime - if you find any bugs, drop a comment.
UPDATE: Just wanted to add - the system is likely slow right now as we are working on some internal mechanics on the server. When we launch it will be butter smooth.
UPDATE 2: And our geocoder has launched.
Wired has the story of how Geomas is suing Verizon, claiming that it infringes patent No. 5,930,474, for an “Internet Organizer for Accessing Geographically and Topically Based Information.”
The repercussions (if Geomas wins) could be far-reaching into the local sphere -
The patent describes an internet search functionality in which users can locate a topic or business based on their location. If you’ve ever looked for a nearby doctor or plumber online using your ZIP code or city, according to Geomas, the site you used likely infringed upon the patent. “In a perfect world, we commercialize the technology and grab licensing fees,” said Jason Galanis, founder of Geomas, which was formerly called Yellowone Investments. “We aren’t necessarily looking to sue as our main business, but realistically I think that’s going to have to happen.”
Praized has a few more thoughts.
iBegin Source has been a learning experience that has opened my eyes a lot - sales cycle, perceived value, etc etc.
We’ve had a lot of experience with doing small sales (< $500) - eg ForumTemplates (in the last 22 minutes the site has had five sales at $17.00 each). Automated processes, quick and detailed instructions, forum for general support (everyone can ‘learn’ together), etc.
iBegin Source has been a different beast. A few of the following points to learn:
All in all - very different from our previous sales experience (through our customers and our own stuff, we’ve pushed over $20,000,000 worth of ‘goods’ over the years).
So with all this in mind, we are going to slightly change our approach. Sales cycle, proof, and (to a certain degree) branding & perception - those are things we don’t have control over. We know we’ve been around for a while. We know we are cash-flow positive. We know that this data is mission-critical to us. But we cannot prove that immediately. Over time, people will see that not only are we still around, but we are thriving. Our sales are already up - time will only help.
In the meantime? We focus on iBegin v3 and iBegin Partners - more on that soon.
My recent posts have included one about Google opening up the directions API, and about Loki and its geo-location systems.
The next flood is open APIs - everything is opening up, and while it is exciting, it is also a bit overwhelming.
Beyond the above two (both great fits for iBegin), we have Garmin releasing an API to interact with its devices, we have Google Mapplets, and Facebook’s shift into a platform. And those are only a few. What about integration with login systems like OpenID and Yahoo? Exporting capabilities so others can create too?
I think we are reaching the point of so many powerful (ie - highly trafficked) sites having open APIs that it is becoming more and more important to have someone full-time mashing your data with these systems. The above examples I gave are all perfect fits - figure out the closest gas station using Garmin. A mapplet for important categories like cafes or fast-food. A module so Facebook users can not only search but also incorporate their reviews, pictures, and events into the system. Allow Yahoo!/OpenID/Google ID users to login so they don’t have to create yet another account.
And the list goes on and on - whew … keeping up is becoming harder and harder.
There is a lot of talk about walled gardens et al., but I believe with the hyper-activity now going on in building out APIs that anyone can use, it is becoming more important to just be everywhere. Users don’t like being forced one way or another - but they do like it when you support a multitude of systems.
Companies were initially afraid of search engines - but then became best buddies with them because of all the traffic they sent. The same thing happened with social networks - companies were very resistant at first, but now you see Digg and Del.icio.us links everywhere. Sure they send traffic to Digg/Delicious/et al., but they get a lot of traffic back. And the same thing is going to happen (especially in the local space) with all these open APIs. Garmin works hard to get its users. Google is always angling for new ways to keep users on its site. Facebook works hard to keep users on its site. It makes sense to leverage their platforms to get more traffic to your sites.
Think about this - a user (with a Yahoo account) ends up on your website. They want to add a review - but have to be logged in first. In one situation, you require them to create a new account. In another, they can login using their Yahoo account. The choice should be obvious.
I believe the ‘winners’ will be those who are found everywhere, on all the major platforms.
I’m not sure anyone else does this (as far as I know - they don’t), so I’m gonna toot my horn on this: customizable embedded YP listings.
An example (zero branding) of Ra in Scottsdale, AZ:
You can also see customization options here.
I usually eschew posting news, but to me, this is big.
Most sites I have come across that use Google Maps API send visitors to Google Maps to get directions. No need anymore - Google has released driving directions as part of their API.
This really makes me pause - why would they do this? I’m sure the directions issue was pushing a fair bit of traffic to Google Maps - are Yahoo/MSN/Ask/Mapquest far behind?