Thursday, March 24, 2005

Personalization is hard. So what?

Philipp Lenssen lists some problems with doing personalized search. Boiling it all down, Philipp is saying that personalized search is hard to do well and can't ever be perfect, so it's not worth doing.

He's right that personalization is hard. It has to work from noisy, sparse information. It has to deal with changing preferences. It has to avoid pigeonholing. It has to make good predictions. And it has to do it all in real time for millions of users.

And he's right that it can't ever be perfect. Personalization is so hard that it's going to make mistakes. Probably a lot of mistakes.

Does this mean personalization is useless? Of course not. Personalization doesn't have to be perfect. It just has to be good enough.

If personalization helps people find what they need faster on average, it's a win. It doesn't have to be right all the time. It just needs to be helpful.

Take one of the examples from Philipp's critique, someone who searches for the single word "restaurant". If those results aren't targeted to a best guess at your location, they're completely useless. Try it on Google and look at the top results. In the vast majority of cases, it would be more helpful to emphasize your local restaurants than return the generic results. Personalization would be helpful.

Or let's take Amazon.com's personalization. Amazon's personalization is far from perfect, but it doesn't have to be. A generic storefront emphasizing top sellers is much less useful to you than a storefront emphasizing mostly products you like. When Amazon's personalization guesses wrong, it shows you something you didn't want, which is what the generic storefront would have done anyway. In general, their personalization is helpful.

Personalization is hard. Personalized search will make mistakes. And that's okay. It doesn't have to be perfect. It just has to be helpful.

8 comments:

Andy Harbick said...

With personalization you even have to deal with spelling mistakes! Imagine that. Spelling mistakes! (Your query link is for "restuarant".)

Greg Linden said...

Oops! Thanks, Andy! I fixed it.

Joe Goldberg said...

When Amazon's personalization guesses wrong, it shows you something you didn't want, AND gives you the opportunity to correct it (via the "Why was I recommended this?" link).

Greg Linden said...

Great point, Joe. Explaining why a recommendation was made is a great feature. It can make a seemingly spurious recommendation seem reasonable, even accurate, as long as the explanation makes sense. Providing an opportunity to correct the recommendation gives the system more data, allowing it to be more accurate in the future.

Our team at Amazon enjoyed building that one. Fun project.

It's worth noting that not all personalization algorithms can provide explanations of their predictions. Something to think about if you're trying to do personalized search or build a recommendation system.

You might not have realized this, but Findory.com also has explanations. Click on the sunburst icon next to any recommended article on the front page or in search results. Findory will explain why it thinks you might be interested in that article.

Philipp Lenssen said...

Greg, in the post you refer to, I tried to differentiate between problems which are just hard to solve (in technical terms) and those which pose deep new problems that are impossible to solve, or go against the nature of search. For example, "4. People like to share", "5. People don’t like everything about something", "6. People search for others", "8. People don’t know what they want in advance", "9. People aren’t stereotypes". These aren't trivial problems which can be solved with technical means. These are deep-running issues for which personalization would pose *new* problems.

As for Amazon, they are using collaborative pattern matching to show related products. This is great. This is not the kind of personalization I was talking about. This is matching patterns: John buys A and B, Frank matches John by buying A, so Frank might like B and will get this suggestion. This is the PUSH part of Amazon; they are suggesting products to you. But I was talking about search engines, which satisfy the PULL part. PULL on Amazon is when you search for something. How bad would it be if the product search showed only things Amazon thinks you "like", for whatever reasons, based on your shopping history!
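The "John buys A and B, Frank buys A, so suggest B" pattern described above can be sketched in a few lines. This is a toy illustration only, with invented names and purchase data, not Amazon's actual algorithm:

```python
# Toy sketch of purchase-overlap pattern matching (hypothetical data).
# John bought A and B; Frank bought A; so B is suggested to Frank.
from collections import defaultdict

purchases = {
    "John": {"A", "B"},
    "Frank": {"A"},
}

def suggest(user, purchases):
    """Recommend items bought by users who share a purchase with `user`."""
    mine = purchases[user]
    scores = defaultdict(int)
    for other, theirs in purchases.items():
        if other == user or not (mine & theirs):
            continue  # no shared purchase, no evidence of shared taste
        for item in theirs - mine:
            scores[item] += 1  # each overlapping user is one vote
    return sorted(scores, key=scores.get, reverse=True)

print(suggest("Frank", purchases))  # ['B']
```

Note this only captures the PUSH case Philipp describes: it surfaces products unprompted, rather than reranking the results of an explicit search.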

Greg Linden said...

Hi, Philipp. Thanks for bringing up all these issues! I think this is a great discussion.

I think the core question here is whether personalization can be used to improve relevance rank. If I understand you correctly, you're saying that changing the relevance rank based on individual search and click history would always reduce search quality. I'm saying that I think there are a lot of cases where it could improve search quality if done right.

It's hard for me to believe that the default relevance rank is the best it can get. When the generic relevance rank is computed, it makes a lot of assumptions, averages against a general population, and deals with noisy and incomplete information. More information should allow the results to be more relevant.

Let's take your Amazon search example. Try a search for "cookbook". If you've bought a few cookbooks on Chinese cooking, wouldn't those search results be more useful if they had a few books related to Chinese cooking sprinkled in there? Clearly, you wouldn't want to show only books on Chinese cooking in this case. But there's some balance where a modest reordering to favor a couple books on Chinese cooking might be helpful.
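The "modest reordering" idea above amounts to a small score boost rather than a filter. Here is a hypothetical sketch with invented titles, scores, and topic labels; the point is that the generic rank still dominates and personalization only nudges near-ties:

```python
# Hypothetical sketch: nudge, don't replace, the generic relevance rank.
# Titles, scores, and topic tags below are invented for illustration.

results = [
    ("Essential Cookbook", 0.90, {"general"}),
    ("Chinese Home Cooking", 0.78, {"chinese"}),
    ("Italian Classics", 0.79, {"italian"}),
]
user_topics = {"chinese"}  # inferred from past cookbook purchases

BOOST = 0.05  # modest: reorders near-ties, never buries a strong generic match

def rerank(results, user_topics, boost=BOOST):
    """Add a small boost to results matching the user's inferred topics."""
    return sorted(
        results,
        key=lambda r: r[1] + (boost if r[2] & user_topics else 0.0),
        reverse=True,
    )

for title, _, _ in rerank(results, user_topics):
    print(title)
```

With the boost, the Chinese cookbook edges just ahead of the Italian one, but the top generic result stays on top; with an empty `user_topics`, the generic order is untouched.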

The goal is to help people find the information they need. If using different relevance ranks for different people helps them find the information they need, that's what we'll have to do.

Philipp Lenssen said...

If I search for "cookbook" at Amazon, and I previously bought Chinese cookbooks, then this could mean different things:
- I want to make a gift to my sister, so I'm basically searching for her
- I want to specifically find different cookbooks because I know how to cook Chinese now
- I want to get more Chinese cookbooks

Simply entering "Chinese Cookbook" or similar resolves the ambiguity, and is trivial to do. Any other way is neither trivial nor always *possible* in the first place.

But let me focus on the merits of personalization, so you can see my whole point: I think Google News (and possibly Findory) customization is a great thing. Because their front page was customized in the first place, only by someone *else*. And because it makes its customization explicit, it empowers me to choose. While it is connected to problem 1, "People change", and 8, "People don’t know what they want in advance", it is still a bonus. This is the kind of problem where your statement fits: "Personalization is hard. So what?" Indeed, here it's worth it, as the gain outweighs the problem!

Again, would I want them to personalize my Google News *searches* based on what I customized, what I clicked on, etc.? No! I want to see the whole thing as objective as possible. I also do not want to buy cyber-glasses some future day which hide the things in the world I find ugly and paint things in my preferred colors. I want to get the whole picture.

What I like about Google News customization at the moment is that you're still getting the world news, always. That's a sort of compromise I find pragmatic. It's guarding against some of the problems, like the fact you don't always know what you want in advance.

Findory is somewhere in between complete-control customization and implicit customization, if I understand it right. And I'm somewhat undecided. I think it's a great idea, like Google News customization for the lazy. Merits here largely depend on the technical implementation (actually understanding the reader), which might be less trivial than "John likes A -- A is topic 1 -- B is 1 as well -- give John B", and I don't know if it's harder to overcome these issues than to just rely on human-configured settings, which the user could manually update every half year or so. But as you said... it's hard, so what!

Greg Linden said...

Great points, Philipp. And thanks for the kind words on Findory. Glad you're enjoying it!

I think the core of our debate comes down to when you said, "No! I want to see the whole thing as objective as possible."

I don't see generic relevance rank as particularly objective. The programmers put a lot of assumptions and educated guesses into it, mostly to deal with incomplete, imperfect, and noisy information about importance and relevance.

Now, if you do see relevance rank as objective, I can see how the idea of fiddling with it would be unappealing. After all, you'd be taking something that is clean and precise and mucking it up with personalization goo.

But I don't think relevance rank is objective, clean, or precise. The way I see it, if more information can make your search results more accurate and relevant to you, we should use that information.