Proving your point by being an idiot

In my last post on blogging and social responsibility, I outlined how blogs are going the way of tabloid rags by opting for sensationalist headlines (to get more links, to get more pageviews, to get more money). My two examples were related to Muslims.

As expected, the vitriol came out - I got a few angry emails, and a few insane comments (including one claiming I must be a pedophile - *shrug*). But what was astounding is that one individual keeps on manually posting comments to my blog. What is hilarious is this person is now resorting to whining that my blog hates “freedom of speech” - missing the obvious connection that as this is my blog, I can do whatever I please to do with it. Even more astounding is that this has been going on for three days. And I know this is manual because I sometimes leave a comment, and he replies - before I go ahead and delete them both.

So to ‘prove’ to me that I hate freedom of speech and that I am a dirty Muslim pedophile, he resorts to spamming my blog, insisting I have to follow his rules.

It goes without saying that he doesn’t have the guts to use his real name and/or email.

  • 46 Comments |

Topix validates Anonymous Contribution

I’ve argued for a long time that having anonymous contributions is a good thing - even with the increase in spam, there is enough legitimate content to make it worthwhile. Plus - once a user has contributed, they are more likely to register and contribute even more.

A recent blog post by Topix validates my argument (huzzah!). The one surprise (for me) was that the amount of spam from non-registered users was only 50% higher than from registered users. My experience is more along the lines of 100-200% more.

Of course there was no qualitative measurement done, but we could argue that in general about UGC ;) Skrenta also has some good takeaways from the #s.

  • 0 Comments |

I read an excellent post on structured vs unstructured data in the local space.

The problem about local data is an impossible human problem. People think differently. What is beautiful to me could be ugly to you. What could be a kebab to me could be a skewer to you. A car could be a piece of trash, and so forth and so forth.

On a related blog post, there was a discussion on building a better database. I’m not sure what Yellowbot was doing there (they just use Localeze data), but I am glad they were.

The entire argument for using a tagging system as your ‘base’ is shortsighted, mostly because (as I explained) people don’t see things similarly. My previous examples were more generic - it gets even more confusing at the local data level. Is it a ‘gas station’ or a ‘service station’? A ‘doctor’ or a ‘medical practitioner’? And so forth and so forth.

We were doing tagging in local space before anyone else (over 18 months now). You can see that users have taken it upon themselves to tag. Yet the same user can use different words when tagging an identical business (‘dry cleaners’ vs ‘laundry’ - even when they provide the exact same service).

Our team has been slogging through the categories used in iBegin Source for roughly the last month, and I’ve never come across a bigger headache. Our task was relatively simple - merge, rename, and prune the categories so that they are simpler to use and more obvious. But the breadth of business listings is enormous. Even getting it down to 10,000 categories is a task not for the faint-hearted (talk about constant cross-referencing to possible matching categories).

So - where do we end?

The core data needs structure. At iBegin we had originally attempted extremely loose categories - 8 in total, tagging to control the rest. Even that caused problems - what about the establishment that is a restaurant until 10 pm, and then exclusively a bar from 10 pm to 2 am? And tagging was great in two ways - it allowed users to participate in a simple way (adding a word or two is relatively trivial), and it improved our meta data (the most important quality in local search). Multiple categories (eg the place is both restaurant and a bar) + tagging = where you want to be.

So what’s the conclusion?

Categories are needed from a top-down level in order to classify businesses properly. A user based system cannot work because too much freedom leads to a mess that cannot be properly organized (much less properly monetized). Tagging on top is a great way to build up a taxonomy - cheap meta-data creation that augments your core classification.
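To make the "categories first, tags on top" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Business` class, the `CATEGORIES` list, and the `TAG_SYNONYMS` map are hypothetical names, not how iBegin actually stores its data. It only shows the shape of the approach: a curated category list that rejects unknown values, multiple categories per business, and free-form tags that get folded together when they mean the same thing.

```python
from dataclasses import dataclass, field

# Hypothetical synonym map: free-form tags that mean the same thing are
# folded into one canonical tag. The entries are illustrative only.
TAG_SYNONYMS = {
    "laundry": "dry cleaners",
    "service station": "gas station",
    "medical practitioner": "doctor",
}

# Hypothetical curated, top-down category list (the structured core).
CATEGORIES = {"restaurant", "bar", "dry cleaners", "gas station", "doctor"}


def normalize_tag(tag: str) -> str:
    """Lowercase, collapse whitespace, and fold known synonyms together."""
    cleaned = " ".join(tag.lower().split())
    return TAG_SYNONYMS.get(cleaned, cleaned)


@dataclass
class Business:
    name: str
    categories: set = field(default_factory=set)  # structured, curated
    tags: set = field(default_factory=set)        # loose, user-supplied

    def add_category(self, category: str) -> None:
        # Categories come from the fixed list only; anything else is rejected.
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        self.categories.add(category)

    def add_tag(self, tag: str) -> None:
        # Tags are free-form, but synonyms collapse to one canonical form.
        self.tags.add(normalize_tag(tag))


# A place that is a restaurant early and a bar late gets both categories.
place = Business("Example Grill")
place.add_category("restaurant")
place.add_category("bar")
place.add_tag("Kebab")

# Two different words for the same service end up as a single tag.
cleaner = Business("Example Cleaners")
cleaner.add_category("dry cleaners")
cleaner.add_tag("Laundry")
cleaner.add_tag("dry cleaners")
```

The synonym map is the part that needs human slogging: someone still has to decide that ‘laundry’ and ‘dry cleaners’ are the same thing, which is exactly the headache described above.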

  • 12 Comments |

The (late) Gnomedex Revue

So Gnomedex was fun.

Interesting truth - I had never met Jacob before. We’ve been doing business for years, we are about a two hour flight away, and yet we had never met up. Something always got in the way - so it was nice to finally meet him.

Coming from an outsider, I felt that Gnomedex was relatively cliquey. This was most evident during the retarded Calacanis / Dave Winer spat (more on that later). Still, most everyone was friendly - more than most other places. So cliquey yet more friendly - wrap that in your head if you can :)

Anyway - the talks were relatively interesting. The keynote ranted and raved - basically the government is messed up, and it needs to be open sourced (there was plenty of fodder in there for later discussion). The husband/wife duo at GeekBrief.tv struggled (I don’t want to be mean - but it didn’t seem like they had practiced at all. Plus finishing each others sentences while on stage is annoying). Guy Kawasaki pulled out a canned speech, but he commanded space and attention that truly was fascinating.

There were other talks too, but the best one was definitely Ignite Seattle. Everyone had only positive things to say - basically 5 minutes, 15 seconds a slide, and GO! People talked about art, iPhones, social networking, and having fun - it was extremely captivating.

The big ‘story’ was the Calacanis / Dave Winer spat. It basically came across like this - Calacanis goes up and says spam is poisoning the internet, including Search. And then he talked about how great Mahalo was. A few people didn’t like it, and yelled at him for spamming the conference. One of them was Dave Winer. Calacanis’ feelings were hurted, and like any real man, he whined about it on his blog. And so forth and so forth.

The reality was that he didn’t offer any real solutions on the spam situation. He didn’t talk about the ‘war of escalation’ the search engines are playing vs the spammers. How linkbait (through the participation of blogs) is being used to stuff in spam content. Nope - just Mahalo this, Mahalo that.

Outside of the conference, I hung out with some fellows Jacob had met last year. They were all terrific mellow people, with extremely interesting stories on how they ended up where they were today. None of that corporate ladder nonsense - real adventures.

Overall I had a blast. The people were unique (yes, best way to describe them) in a way no other conference I’ve been to can claim.

  • 0 Comments |

A Thought about Digg

There are many posts about how Digg traffic sucks. I agree with those sentiments (been Dugg about 25-30 times).

The earlier argument was that it led to a lot of links. I used to agree - before. Nowadays, it seems like getting Dugg leads to little or no linkage. And that is if your story even makes it - in the last 10 days we have had four pages/sites Dugg indirectly - our own submission was buried, but a site like Lifehacker or News.com (that just linked/regurgitated what we said) made it through.

So - if the traffic is useless, and its ability to generate links is weakened - why would anyone want to buy Digg?

Something to chew on.

  • 4 Comments |

My recent posts have included one about Google opening up the directions API, and about Loki and its geo-location systems.

The next flood is open APIs - everything is opening up, and while it is exciting, it is also a bit overwhelming.

Beyond the above two (both great fits for iBegin), we have Garmin releasing an API to interact with its devices, we have Google Mapplets, and Facebook’s shift into a platform. And those are only a few. What about integration with login systems like OpenID and Yahoo? Exporting capabilities so others can create too?

I think we are reaching the point of so many powerful (ie - highly trafficked) sites having open APIs that it is becoming more and more important to have someone fulltime mashing your data with these systems. The above examples I gave are all perfect fits - figure out the closest gas station using Garmin. A mapplet for important categories like cafes or fast-food. A module so Facebook users can not only search but also incorporate their reviews, pictures, and events into the system. Allow Yahoo!/OpenID/Google ID users to login so they don’t have to create yet another account.

And the list goes on and on - whew … keeping up is becoming harder and harder.

There is a lot of talk about walled gardens et al., but I believe that with the hyper-activity now going on in building out APIs that anyone can use, it is becoming more important to just be everywhere. Users don’t like being forced one way or another - but they do like it when you support a multitude of systems.

Companies were initially afraid of search engines - but then became best buddies with them because of all the traffic they sent. Same thing happened with social networks - companies were very resistant at first, but now you see Digg and Del.icio.us links everywhere. Sure they send traffic to Digg/Delicious/et al., but they get a lot of traffic back. And the same thing is going to happen (especially in the local space) with all these open APIs. Garmin works hard to get its users. Google is always angling for new ways to keep users on its site. Facebook works hard to keep users on its site. It makes sense to leverage their platforms to get more traffic to your sites.

Think about this - a user (with a Yahoo account) ends up on your website. They want to add a review - but have to be logged in first. In one situation, you require them to create a new account. In another, they can login using their Yahoo account. The choice should be obvious.

I believe the ‘winners’ will be those who are found everywhere, on all the major platforms.

  • 0 Comments |

This post is about making money from domain names. The reality is that domain parking is not only here, it is going to grow (and evolve). With companies that have some serious money behind them [100 million+] (eg GeoSign, DemandMedia, iReit, etc), domaining isn’t going to go away.

I’ve never been impressed by parked pages. Completely boilerplate, they look awful. I’ve never doubted that domains get a lot of traffic - I just don’t understand who actually clicks on those links. All my tech-clueless friends have said they come across these pages regularly, hate them, and never click on ads.

Regardless - innovation is afoot. The reality is domains get a lot of targeted traffic, and PPC is not the best way to extract maximum value. The standard PPC site can only go so far - you can spruce it up, add pretty colors, etc - but the underlying system is the same. The long term value is very low, and the growth of type-in traffic is pretty much flat lined.

We’ve dabbled in this lightly via iBegin - for example, Minnesota.com. The site has grown roughly 300% in traffic since we put up the new system - not too bad. And this is just the start - we intend on growing this out - reviews, photos, and better monetization (eg hotel affiliates, real-estate, florists, etc). But that is for another day.

So I was intrigued when a friend of mine told me about his upcoming system - Domainer.com. His first live example: Bags.com.

The idea is simple - take a domain, add in content via bloggers (who have agreed to have their content distributed), add some extra relevance via tags, and then sell the product directly (via Shopping.com) instead of using a PPC aggregator like Google or Yahoo.

What was interesting was that he was able to negotiate the ability to replicate blog posts completely on their site (eg Dior Flight Hobo). The idea is to be a win-win for both parties - the domain (getting intrinsic traffic) sends traffic to the blogger (who is properly cited), and the blogger acts as a content creator for the domain.

Domainer.com has taken steps to ensure that the original URL (for each blog post) is given its due - linking back and using the ‘cite’ attribute (defined by the W3C):

The value of this attribute is a URI that designates a source document or message. This attribute is intended to give information about the source from which the quotation was borrowed.
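For the curious, here is roughly what that kind of attribution looks like in markup. This is only a sketch of the general technique, not Domainer.com's actual code: the `render_quoted_post` function, the example URL, and the post title are all made up for illustration. It names the source twice, once in the machine-readable `cite` attribute and once as a visible link that a reader can actually click.

```python
from html import escape


def render_quoted_post(excerpt: str, source_url: str, source_title: str) -> str:
    """Wrap an excerpt in a blockquote that credits its source.

    The source URL appears in the cite attribute (for machines) and in a
    visible link (for readers). All inputs are HTML-escaped.
    """
    url = escape(source_url, quote=True)
    return (
        f'<blockquote cite="{url}">\n'
        f"  <p>{escape(excerpt)}</p>\n"
        f'  <p>Source: <a href="{url}">{escape(source_title)}</a></p>\n'
        f"</blockquote>"
    )


# Hypothetical blog post being republished on the domain.
snippet = render_quoted_post(
    "A roomy hobo bag & a very loud print.",
    "https://blog.example.com/dior-flight-hobo",
    "Dior Flight Hobo",
)
print(snippet)
```

Browsers do not display the `cite` value on their own, which is why the visible link matters if the blogger is supposed to get traffic out of the deal.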

Does Google care for it? Doesn’t seem like it.

I’m not a fan of SEO for parked domains - a site that *only* has a bunch of ad links should not be getting any search engine traffic. But I think bags.com may be the exception. Or it is close - just not there yet. While I like the content (you can think of it as the ugly cousin of an aggregator like popurls ), the presentation leaves me unhappy. The content is there, but it seems like it was jammed on the side so one can claim that they do have content. I’m not sure what the purpose was with the 468×60 banner on top - it links to a page just like the links on the left menu do. Why add such clutter?

I wish the site involved a bit more user-generated content. Does it cause moderation headaches? Yes. But in the grand scheme of search-engines, it also creates unique content. Let me get involved somehow.

Right now I would give the site a C+. It definitely extends the idea of Bags.com. It definitely makes it more compelling than a bunch of ad-links. But it falls short of being truly useful. I have no way of participating (not even an ‘email a friend’ link). The store is the complete focus, with the content on the side.

I think in the case of Bags.com a dual-pane approach for the index page may work better. Left side can say ‘Interested in buying bags?’ and the right side can say ‘Want to read about bags?’. At least give the content-side its fair shake.

Regardless - good first version, but it needs some updates before it truly becomes useful. Right now it feels like an SEOed storefront with some content on the side.

  • 4 Comments |

The news has spread to every nook on the internet - Google picked up Doubleclick for a sweet $3.1 billion.

With it, Google gains access to some big-name clients of DC - brand managers for major companies that have a lot of money. PPC is just a facet of internet advertising - there is also banner advertising (which Google has a small share of, and DC a large share of), CPA (which Google is expanding into, and DC has the highest quality network for), and email (which I don’t think Google is going into any time soon).

Anyway, the story goes that Google beat out Microsoft (again). And by beating out Microsoft, they also beat out Yahoo (after all, all three are going after the same - internet advertising).

Yahoo is looking especially vulnerable. Since picking up Flickr (which I still argue was mostly for guaranteed ad-inventory), and Delicious/MyBlogLog (both for user behaviour/tracking), they’ve been relatively silent. There was supposed to be a deal for Facebook, but that has not produced anything.

If there is one product where I think Yahoo is pushing ahead of Google, it is local. Local is more than just local search though - it is about a presence in the local area. And while I argue Yahoo! Local is superior to Google Maps, Yahoo has really pushed ahead of Google with key partnerships with newspapers, the latest being with McClatchy.

So while Yahoo continues to sustain PR black-eyes, what can it do?

Buy up companies in the emerging local market.

With that in mind, and a way to get a leg-up on Google (for once), Yahoo could go and buy out (for relatively cheap) both Local.com and Yelp.

Local.com keeps making noises about how much traffic it pushes (roughly 11 million unique visitors a month). They claim $35 RPM (revenue per 1000 pageviews), and ~$90 per 1000 daily visitors. While their CPM doesn’t compare to Google’s, it has gone up significantly, and by utilizing Panama, it should be able to make even more. As Google and Yahoo jostle for the local market, Local.com and its 11 million unique visitors a month make perfect sense. There are some other things to consider (eg Local.com heavily invests in PPC), but a great domain with steady traffic could be a good call. And local.com needs money - it just received $8 million in funding.

The other site would be Yelp, which simply fits into Yahoo’s strategy of user-generated content: Jumpcut, Flickr, Upcoming.org, and Del.icio.us. User generated, active participation, without too much butting in by Yahoo (though I am sure quite a few Flickr fans would disagree). They would get their hands on some of the most loyal visitors in the local area (you don’t drop 400 reviews on a site and then just get up and leave). Yelp struggles with their ad-sales, but few sites get as much attention as they do (especially considering how much real traffic they do get). But that attention has helped them skyrocket, with Hitwise reporting that Yelp’s traffic has increased 91% in just six months.

Get the top independent local search site, and the top local review site - I do wonder what the valuation would come out to, but I am sure Yahoo has enough cash to scoop them both up.

  • 0 Comments |

With big sites like Blog Flux, when you deliver a service, you better have it ready for scaling.

We cache every single pagerank request to our Pagerank Checker. Obvious reason - PR updates rarely, and when it does, we can just flush the cache. In the meantime, if we keep asking Google the pagerank every time we get a request, we end up with 1) slower response time and 2) higher chance of being banned by Google.
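The idea is simple enough to sketch in a few lines. This is not our actual implementation (which caches the rendered PR images on disk); it is a minimal in-memory version, and `PagerankCache`, `fake_fetch`, and the counter are hypothetical names for illustration. The upstream lookup is passed in as a plain function so nothing here pretends to know how Google is actually queried.

```python
from typing import Callable, Dict


class PagerankCache:
    """Cache every lookup; only ask upstream on a miss; flush on a PR update."""

    def __init__(self, fetch: Callable[[str], int]) -> None:
        self._fetch = fetch              # the slow, rate-limited upstream call
        self._store: Dict[str, int] = {}
        self.upstream_calls = 0          # how often we actually went upstream

    def get(self, url: str) -> int:
        if url not in self._store:
            self.upstream_calls += 1
            self._store[url] = self._fetch(url)
        return self._store[url]

    def flush(self) -> None:
        """Drop everything, e.g. when a pagerank update rolls out."""
        self._store.clear()


def fake_fetch(url: str) -> int:
    """Stand-in for the real lookup: returns a made-up value from 0 to 10."""
    return len(url) % 11


cache = PagerankCache(fake_fetch)
for _ in range(3):
    cache.get("example.com")   # only the first call goes upstream
calls_before_flush = cache.upstream_calls

cache.flush()                  # pretend a PR update just happened
cache.get("example.com")       # goes upstream again
calls_after_flush = cache.upstream_calls
```

Three requests cost one upstream call, and the only time we go back is after a flush. That is the whole trick behind both the faster response time and the lower chance of a ban.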

I ran the math - we have a total of 33 gigs cached on our server. At an average size of 225 bytes per PR image (0-10), that comes out to almost 150,000,000 different URLs checked for pagerank by us! This is only a count on unique URLs - since the last pagerank reset we have delivered over 10 billion images.

I don’t remember the exact date, but this thread puts the last PR update at January 9. We have had roughly 90 days elapse since then. Number of images we’ve served up:

  • Every day: 111 million
  • Every hour: 4.6 million
  • Every minute: 77,000
  • Every second: 1300
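If you want to check the math yourself, the figures above fall out of a few divisions. The assumptions are mine: "33 gigs" is read as decimal gigabytes, "over 10 billion" is treated as exactly 10 billion, and the 90 days is the rough estimate from the thread, so the results are ballpark numbers rather than exact counts.

```python
CACHE_BYTES = 33 * 10**9     # "33 gigs", read here as decimal gigabytes
BYTES_PER_IMAGE = 225        # average size of one cached PR image
IMAGES_SERVED = 10 * 10**9   # "over 10 billion", used as a floor
DAYS_ELAPSED = 90            # rough time since the last PR update

# One cached image per unique URL checked.
unique_urls = CACHE_BYTES // BYTES_PER_IMAGE

per_day = IMAGES_SERVED / DAYS_ELAPSED
per_hour = per_day / 24
per_minute = per_hour / 60
per_second = per_minute / 60

print(f"unique URLs cached: {unique_urls:,}")
print(f"images per day:     {per_day:,.0f}")
print(f"images per hour:    {per_hour:,.0f}")
print(f"images per minute:  {per_minute:,.0f}")
print(f"images per second:  {per_second:,.0f}")
```

That works out to about 146.7 million unique URLs and about 1,286 images a second, which is where the "almost 150,000,000" and "1300" come from.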

Average response time is roughly 0.10 seconds.

For the domains we have PR cached for, the most popular first letter was ‘s’, followed by ‘m’ (way behind). The least popular was ‘q’, with ‘z’ about 2x more popular. ‘Q’ was roughly 2.4% as popular as ‘s’.

  • 0 Comments |

Synergy - it can work!

Synergy was an awesome buzzword. Not only does it mean something, it just sounds awesome. If you had synergy, you couldn’t go wrong.

People have often asked me - is there something wrong with me? Do I have ADD? (not to imply ADD is wrong, but you get the idea). Why can’t I just *focus* on one thing and go with it?

The answer is simple: having resources in various areas makes it much easier to ‘push’ a product.

Case in point: Blog Flux Local

While not launched, the entire project is a very daunting task - we essentially want to catalog local content, and geocode it to where it belongs. Similar to outside.in, but really - more simplistic.

So one of our initial problems was - how do we figure out what place a post is about? We can attempt to parse out street intersections etc, but that is haphazard. We can ask people for GPS coordinates (FeedBurner supports this) - but who the hell is gonna figure that out?

The simple truth is that we associate places with names (or even street intersections). I would say “McDonald’s near Elm and Queen Street”. I wouldn’t say ‘131 Elm Street’ or ‘23.2352, -115.234234’. Now to be able to do something like that, we need both the business data and the geocoder.

And so now in comes iBegin Source and iBegin Geocoder (launching soon). We already have support for linkage on iBegin Source - basically you link to that specific page, and we link back (right now you have to manually add the link, but we are working on a trackback system for that). Example: Best Vet Inc in Boynton Beach, FL.

We know the post is about XXX, we know that XXX is located in YYY - so now we know where all of this is.
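In code terms, the lookup is about as simple as it sounds. The sketch below is an illustration only, not the real system: the `LISTINGS` table, the listing URL, the business name, and the coordinates are all made up, and a real version would need proper HTML parsing rather than a regular expression. It just shows the chain: post links to listing, listing has a location, so the post has a location.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical listing index: listing page URL -> (name, latitude, longitude).
# The URL, name, and coordinates are placeholders, not real data.
LISTINGS: Dict[str, Tuple[str, float, float]] = {
    "https://source.example.com/listing/1001": ("Example Cafe", 10.0, 20.0),
}

# Naive link extraction; good enough for a sketch.
LINK_PATTERN = re.compile(r'href="([^"]+)"')


def locate_post(post_html: str) -> Optional[Tuple[str, float, float]]:
    """Return the first known listing a post links to, or None."""
    for url in LINK_PATTERN.findall(post_html):
        listing = LISTINGS.get(url)
        if listing is not None:
            return listing
    return None


post = (
    '<p>Great coffee at '
    '<a href="https://source.example.com/listing/1001">Example Cafe</a>.</p>'
)
found = locate_post(post)
missing = locate_post("<p>No links here.</p>")
```

The blogger never has to type a coordinate. They link to the business by name, and the geocoded listing does the rest.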

The next challenge then is to introduce bloggers into this system. And that is where Blog Flux’s fantastic reach comes into play. Almost 31,000 blogs approved, and over 72,000 registered users. Throw in Blog Top Sites with another 30,000 members (50% overlap with Blog Flux), and you now have the potential to reach 87,000 users about this service (by the time we launch it should be 90,000). Blog Flux is going from strength to strength (just peaked at 45,000 pageviews a few days ago) - this will just push it further along :)
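For anyone wondering how 72,000 plus 30,000 comes out to 87,000, the overlap is the reason. A quick sketch, assuming the 50% overlap means half of the Blog Top Sites members are also registered on Blog Flux (the variable names are mine):

```python
blog_flux_users = 72_000
blog_top_sites_members = 30_000
overlap_share = 0.5  # assumed: share of Blog Top Sites members also on Blog Flux

# Members counted in both networks should only be counted once.
overlap = int(blog_top_sites_members * overlap_share)
combined_reach = blog_flux_users + blog_top_sites_members - overlap

print(f"{combined_reach:,}")
```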

I’m not going into more details about how we are presenting the data and so forth (aha!), but this should give an idea of how having multiple established brands can be a good thing. Do remember that both iBegin and Blog Flux have their own staff, so it’s not like you can just set up two brands and enjoy. It takes time to do that too.

  • 0 Comments |