iBegin Source: a sitemap beast

Remember this post on Google Sitemaps? It turned out that while Google downloaded the main sitemap file, it hadn’t downloaded the others. Instead of saying ‘wait up we are still downloading them’, it simply threw out ‘Errors.’

That earlier post also went up before we had finished geocoding everything. With the updated sitemap ready, the number of pages ended up at 21,292,229.
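For a sense of scale, here is a rough sketch of how a sitemap that size breaks down, assuming the sitemaps.org limit of 50,000 URLs per sitemap file and a sitemap index that points at the pieces. The base URL and the `sitemap-N.xml` naming are made up for illustration; they are not how iBegin actually names its files.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemaps.org protocol


def sitemap_file_count(total_urls: int) -> int:
    """Number of sitemap files needed, rounding up with integer arithmetic."""
    return (total_urls + MAX_URLS_PER_SITEMAP - 1) // MAX_URLS_PER_SITEMAP


def build_sitemap_index(base_url: str, file_count: int) -> str:
    """Build a sitemap index listing file_count child sitemaps.

    The sitemap-N.xml naming scheme is a hypothetical choice.
    """
    entries = []
    for i in range(1, file_count + 1):
        loc = escape(f"{base_url}/sitemap-{i}.xml")
        entries.append(f"  <sitemap><loc>{loc}</loc></sitemap>")
    return "\n".join(
        [
            '<?xml version="1.0" encoding="UTF-8"?>',
            '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
            *entries,
            "</sitemapindex>",
        ]
    )


files_needed = sitemap_file_count(21_292_229)
print(files_needed)  # 426
print(build_sitemap_index("https://example.com", 2))
```

So the 'main sitemap file' is really just an index, and every one of the 426 child files has to be fetched before the whole thing counts as downloaded.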

Anyone know of a bigger sitemap?


15 things iBegin Source does better

All about iBegin Source:

  1. Focus on local data. The competitors talk about businesses, consumers, mailing lists, and so forth. We are just about business data. Nothing more, nothing less.
  2. Nothing to hide. Company X can claim 14,000,000 records - but how can you be sure the data is any good? We have opened up our data for everyone to see.
  3. It is an open system. Are we missing a business? Add it in under a minute. Do we not have a URL that we should? Edit it and be done in 15 seconds flat.
  4. Cheap. Did I say cheap? I meant really cheap. Other companies try to charge you above $500,000. Not a single company charges under $100,000.
  5. No pushy sales team. No need to fill out a ‘quote request’ or go through a ‘sales specialist’ trying to wring every single penny out of you. Just a simple automated system.
  6. Already geocoded. Geocoding over 10 million addresses is not a cheap endeavor. Does anyone else offer it? Nope.
  7. No extra charges. Daily, weekly, and monthly updates are available at no extra cost. The other providers? Get ready to pay more. Popular site? You pay extra based on how popular your site is.
  8. We support web standards. From hCard to WCAG Accessibility to the interestingly named ‘ICBM’ meta tag, we support it.
  9. Trackback system for automated updates to us. No data-provider lets you send them updates. We want those updates - send them to us.
  10. Free download (non-commercial). Yes we had to remove phone numbers. And yes it isn’t geocoded. But no one else lets you download information on over 10,000,000 US businesses without paying a dime.
  11. Expanding soon. We are only in the US right now, but in the upcoming months we are expanding worldwide, including to countries like Canada, the UK, New Zealand, and more.
  12. Use the data forever. Other data brokers make you pay a yearly fee (just to use the data) - not us.
  13. Franchises galore. We have created a system for extracting information on over a million franchise locations in the US.
  14. Experience. Unlike the other companies, we use the data ourselves for a local search site. We know what works, and what doesn’t.
  15. Not publicly mentioned, but once we see some interesting non-commercial applications, we intend to give them a commercial license for free.
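To make point 8 a little more concrete, here is a minimal sketch of what an hCard and an ‘ICBM’ meta tag look like for a single geocoded business. The class names follow the hCard microformat and the meta tag uses the usual `latitude, longitude` content; the business record and the function names are invented for illustration and say nothing about how iBegin generates its own pages.

```python
from html import escape


def icbm_meta(lat: float, lon: float) -> str:
    """Page-level coordinates as an ICBM meta tag (latitude, longitude)."""
    return f'<meta name="ICBM" content="{lat:.6f}, {lon:.6f}">'


def hcard(name: str, street: str, city: str, region: str,
          postal_code: str, lat: float, lon: float) -> str:
    """Render one business listing as hCard markup, escaping all text fields."""
    return "\n".join([
        '<div class="vcard">',
        f'  <span class="fn org">{escape(name)}</span>',
        '  <div class="adr">',
        f'    <span class="street-address">{escape(street)}</span>',
        f'    <span class="locality">{escape(city)}</span>,',
        f'    <span class="region">{escape(region)}</span>',
        f'    <span class="postal-code">{escape(postal_code)}</span>',
        '  </div>',
        '  <span class="geo">',
        f'    <span class="latitude">{lat:.6f}</span>,',
        f'    <span class="longitude">{lon:.6f}</span>',
        '  </span>',
        '</div>',
    ])


# Made-up record, purely for illustration.
print(icbm_meta(39.7817, -89.6501))
print(hcard("Joe's Diner & Grill", "123 Main St", "Springfield",
            "IL", "62701", 39.7817, -89.6501))
```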



iBegin Source is Live

Yep, it is finally here - iBegin Source.

I will be adding a post about ‘15 things iBegin Source does better’ soon.


Note: Locking this thread for comments; add your comments to the 15 things iBegin Source does better post.


Relaxation is Overrated

Having driven 1200 km (roughly 750 miles) in 36 hours, I truly realize that driving a long way to ‘relax’ is a surefire recipe for disaster :)

Regardless, blogging will be light for a while. Nose to the grindstone for the impending launch of iBegin Source, and if it all goes right, it should be live in 24 hours.


And a bit of a break today…

iBegin Source is right around the corner, and that can only mean one thing for me: break time.

Driving off to some interesting natural hot-springs resort with the missus to prepare the body for the onslaught. Be back soon :)


CSSVault Sells for $100,000

While I helped facilitate this sale a while ago (and there was a bit of a ‘quiet period’ going on with it), the go-ahead to go public with it is here.

CSSVault, the venerable CSS directory, was sold by BloggyNetwork to HostGator Web Hosting for a smooth $100,000.

Revenue and traffic are not going to be disclosed, but let’s say the strong brand and strong (actual, and not wishful) potential were a big reason HostGator purchased it.

I will have a follow-up interview with Brent Oxley, HostGator’s president, soon. He has promised me that the site will be revamped soon. This isn’t a simple purchase - it’s an investment.


Just sold …

We just sold a notable site for a decent sum. Will post about it soon.

Read about CSSVault being sold.


StumbleUpon: You mean there are timezones?

After a regular search, I noticed an SU icon next to every single search result (odd). Clicking on it, I was met with a downtime notice.

Good stuff - they are down, and they will be back up at 8pm.

But wait - 8pm where?

It was plausible that StumbleUpon was doing some IP tracing, figuring out where I live, and then being super-friendly and letting me know that it would be back at 8pm my time. Unlikely, but maybe. A quick check with a friend in the Far East yielded the same message - 8pm.

Thanks guys, but that message is almost as useless as ‘back soon.’
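The fix is tiny. As a minimal sketch (the function name, the date, and the UTC-8 offset are all made up for illustration, not anything StumbleUpon actually uses), a downtime notice only needs to refuse naive times and print the zone:

```python
from datetime import datetime, timedelta, timezone


def back_online_message(when: datetime) -> str:
    """Format a downtime notice that is unambiguous for every reader."""
    if when.tzinfo is None:
        # A naive "8pm" is exactly the problem: 8pm where?
        raise ValueError("need a timezone-aware datetime")
    utc = when.astimezone(timezone.utc)
    return f"Back online at {utc:%Y-%m-%d %H:%M} UTC"


# Hypothetical example: 8pm at a fixed UTC-8 offset.
server_zone = timezone(timedelta(hours=-8))
print(back_online_message(datetime(2007, 3, 1, 20, 0, tzinfo=server_zone)))
# Back online at 2007-03-02 04:00 UTC
```

Stating the time in UTC (or at least naming the zone) lets a reader in the Far East work out the wait just as easily as someone next door to the servers.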


Is it just me, or is there not a single web 2.0 company that is actually trying to make it work as an independent?

You may argue Facebook is trying to go that route (ever since their Facebook Query Language I have been a super-fan), but I only see it as posturing. They have reported being profitable (at roughly $150 million revenue a year), but other reports are leaking about horrific marketing results. VCs are heavily invested in the company. And the quickest payout will be through a buyout, hoping Yahoo succumbs.

But let’s even say Facebook holds out and goes IPO (or even stays private). What about everyone else? From Delicious to MySpace to unlaunched startups, everyone is getting snapped up.

‘Back’ in the ‘Web 1.0’, the companies wanted to be the winners. From Yahoo to Excite to Lycos, they had no desire to be acquired. They wanted it all. It seems the latest batch of ‘companies’ are just looking to cash out as quickly as possible. And with Alan Greenspan forecasting a possible recession within 9 months, maybe it’s a good move.

Then again, the web 1.0 survivors went on to become big winners.


XSS - Oh Dear

Yesterday I got a friendly email in the iBegin inbox. In a very professional manner, the person informed me of a Cross Site Scripting (XSS) hole. I responded immediately, and a day later the details popped into my inbox. It was a simple example - with a certain string, my search page popped a nice JS error saying ‘XSS!’

This was rather bewildering. I have spent a lot of time researching such holes, and here I was the victim of my own.

The end result was less spectacular than I had been fearing - when adding in the ad-code for Google, I had opted to use the ‘hint’ option. In my rush, I had never filtered the part where I dynamically inserted the keyword the user had searched for. And just like that a nasty nasty XSS hole was born.
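For anyone wondering what ‘filtering’ means in practice, here is a minimal sketch of the idea, not the actual iBegin code. It assumes the search keyword lands in two places on the page: in regular HTML and inside a JavaScript string in a script block. The `ad_hints` variable and the function names are hypothetical stand-ins for the ad-code.

```python
import html
import json


def js_string_literal(value: str) -> str:
    """Encode text as a JavaScript string literal that is safe inside <script>.

    json.dumps handles quotes, backslashes and non-ASCII characters; the extra
    replacements stop a keyword containing </script> from closing the block.
    """
    literal = json.dumps(value)
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


def render_search_page(keyword: str) -> str:
    """Insert the keyword into HTML and into the (hypothetical) ad hint."""
    safe_html = html.escape(keyword, quote=True)  # HTML context
    safe_js = js_string_literal(keyword)          # JavaScript context
    return (
        f"<h1>Results for {safe_html}</h1>\n"
        f"<script>var ad_hints = {safe_js};</script>"
    )


# A hostile 'search' that tries to break out of both contexts.
payload = '"; alert("XSS!"); //</script><script>alert(1)</script>'
print(render_search_page(payload))
```

The point is that each context needs its own escaping: HTML-escaping alone would not have been the right tool for a value dropped into a JavaScript string.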

XSS is bloody scary. Basically, with a hole like that an attacker can extract a lot of user info, allowing them to effectively act as that user. Heck, MySpace was literally brought to its knees by a little XSS hack. And protection against XSS is like building a fortress - if your fortress has even one little hole in it, you are in trouble.

I’ve already mentioned how most ‘programmers’ on the web are crap. When you mix JavaScript in, that’s like asking to be messed with.
