A while ago, we started work on a new homepage algorithm for Stack Overflow.
See phase 1 (with feedback) and motivations for more detail.
We had gotten to a point where we were pretty happy with the unanswered questions in that algorithm, but the "interesting answered questions" weren't cutting it. Phase 2 is a new whack at that problem.
We're now mixing in a few explicitly interesting questions on https://stackoverflow.com/?tab=recommended&subtab=recent. I'd like some feedback on how we're doing.
A question qualifies as "interesting" for you when it is a) answered, b) that answer has a score > 0, and c) the question itself has a score > 0.
Here's how it's working* now:
- Filter to recent quality questions:
  - Answered (has an answer with score > 0)
  - Upvoted (effectively, score > 0)
- For each post, calculate how much of an outlier it is on the following metrics:
  - Question score / age
  - Question score / length
  - Question score / views
  - ... and the same three again, but for the answer
- Weight that outlier-ness by your tag preferences (explicit or inferred).
- Randomly choose from the weighted set of questions and mix them in.
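The steps above can be sketched roughly like the following. This is an illustrative toy, not the actual implementation: the field names, the z-score notion of "outlier-ness", and the default weight for unmatched tags are all assumptions on my part (and, per the footnote, a lot of normalization is omitted).

```python
import random
import statistics
from dataclasses import dataclass

# Hypothetical post record; these field names are illustrative,
# not Stack Overflow's real schema.
@dataclass
class Post:
    title: str
    tags: list
    q_score: int       # question score
    a_score: int       # score of the best answer
    age_hours: float
    length_chars: int
    views: int

def outlierness(posts):
    """Sum, over the six score ratios, how many standard deviations each
    post sits from the mean (a crude stand-in for the omitted normalization)."""
    metrics = [
        lambda p: p.q_score / p.age_hours,
        lambda p: p.q_score / p.length_chars,
        lambda p: p.q_score / p.views,
        lambda p: p.a_score / p.age_hours,    # same three again, for the answer
        lambda p: p.a_score / p.length_chars,
        lambda p: p.a_score / p.views,
    ]
    scores = [0.0] * len(posts)
    for metric in metrics:
        vals = [metric(p) for p in posts]
        mean = statistics.mean(vals)
        sd = statistics.pstdev(vals) or 1.0   # avoid division by zero
        for i, v in enumerate(vals):
            scores[i] += (v - mean) / sd
    return scores

def recommend(posts, tag_prefs, k=2):
    """tag_prefs maps tag -> preference weight (explicit or inferred)."""
    # Filter: answered with an upvoted answer, and an upvoted question.
    pool = [p for p in posts if p.a_score > 0 and p.q_score > 0]
    raw = outlierness(pool)
    # Weight outlier-ness by tag preference; 0.1 is an assumed floor so
    # posts outside your tags can still occasionally surface.
    weights = []
    for post, score in zip(pool, raw):
        pref = max((tag_prefs.get(t, 0.0) for t in post.tags), default=0.0) or 0.1
        weights.append(max(score, 0.0) * pref + 1e-6)  # keep weights positive
    # Randomly choose from the weighted set (with replacement, for simplicity).
    return random.choices(pool, weights=weights, k=k)
```

The weighted random draw at the end is what keeps the mix from being purely "top outliers", which would show everyone the same handful of posts.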
A benefit of this approach is that it "tunes" to how active Stack Overflow is at that time.
Some examples of questions this algorithm has chosen for me (a mostly-C# developer) since it went live earlier today:
Why is a nested struct inside a generic class considered "managed"?
Boolean array initialization in C
What namespace will a class have if no namespace is defined
Entity FrameWork CodeFirst using context in Controller
Efficient BigInteger multiplication modulo n in Java
Doing a good job for me (the person who mostly designed the algorithm) is basically the low bar. Now it's time for broader feedback. So, what do you think?
*Simplifying by omitting a lot of normalization here.