badserver | Nov 11, 2010 | Buried
I so Digg it. The Digg v4 experience is becoming interesting again. I hope you guys get the traffic back too.
minnull | Nov 11, 2010 | Buried
I've decided to come back on board recently since the site does seem to be heading back in the right direction. Save the community, save the site.
heidenreich12 | Nov 11, 2010 | Buried
I feel the same way. I have officially started coming here again. Reminds me of the good ol' days. Keep up the good work, and try not to get off track this time :)
l0ner | Nov 11, 2010 | Buried
I love the load more button that loads no more comments.
ltgenpanda | Nov 11, 2010 | Buried
Thanks for the regular roll-out of new features.
I would like to hear from digg staff about two issues:
1. It is nice to see that digg staff are now commenting regularly, but how about digg staff pay a daily visit to the Digg API support group? Some questions have been waiting for more than 10 days. In the past I have written emails to the community manager about the same issue, which were then passed on to others, and for a short time there were timely responses. Now we are back to square one. The API has a ton of things to be fixed, and unless digg staff want to engage in a constructive dialog with the people who use the API, it will just stay the same.
2. This issue has been brought up by many people, several times. I have raised it numerous times and have been told that the fix is very close. But some staff seem not to even recognize the issue. So, one more time...
Clicking the upcoming tab on the site and sorting by newest stories, or getting the same list via the API, shows stories submitted from about 10 minutes ago, at roughly one story a minute. All stories there will have at least two diggs. But the fact is (as you can see through the stream) that many more stories are submitted; they are simply not included in the upcoming tab or the API. Is there a reason to select only a few stories and leave out the majority of them? Or is it a bug (as agreed to by several staff in the past: http://digg.com/news/technology/digg_brings_back_user_submissions_page/20100930160938:d607cf41f95f42ed91abf6c441775edb)? This bug has existed ever since digg rolled out the upcoming tab.
No matter what, Digg has surely made steady progress since the V4 launch, and soon we will be back to where we were!
cduruk | Nov 11, 2010 | Buried
Glad you are back!
robbh66 | Nov 11, 2010 | Buried
Dear Staff,
How hard can it be to get rid of the spammers?
Posts with every other line a return.
Links in the post.
Links flanked by some sort of symbol.
The same post posted 20 times in 5 minutes.
Talking about cheap fashion.
Dollar signs used frequently.
An account that was made 5 minutes ago.
It seems like this should be a slam dunk to take care of, given all the similarities most of these posts share. If I'm wrong, let us know, but a lot of us think it is a joke that this still hasn't been taken care of.
Thanks.
superman101 | Nov 11, 2010 | Buried
I am so jealous.
dajobe | Nov 11, 2010 | Buried
Don't break it
richid | Nov 11, 2010 | Buried
Combating spam is not a trivial endeavor and is a constant battle (as a coworker once said, "all you can do is put more furniture in front of the door"). Individually, the heuristics you listed could be potential indicators of spam, but they are fuzzy and would lead to false positives. Combined, and with differing weights, they could make a fairly effective algorithm, but the cost of running this algorithm in real time as comments are posted is also non-trivial. Personally, I think bringing back the comment reporting functionality was a huge win. It doesn't stop spam from being posted, but it lets us remove it much more quickly.
That being said, this is a topic we're taking seriously and are working to address as quickly as possible. We've had a couple of releases recently that contained some steps to combat this, and we're monitoring the results.
By the way, another decent spam heuristic would be the physical proximity of the keys for the characters in a username. That is, many spammers' usernames are just the result of mashing a single row of keys on a standard QWERTY keyboard. Now that I've mentioned it, that heuristic has likely become much less valid. :)
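To make the "combined and with differing weights" idea concrete, here is a minimal sketch in Python. It is not Digg's actual filter: the signal names, the weights, the cutoffs, and the threshold are all hypothetical values picked for illustration, and the inputs (`account_age_minutes`, `recent_identical_posts`) are assumed to be available from elsewhere.

```python
import re

# The three letter rows of a standard QWERTY keyboard.
QWERTY_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")

def single_row_username(name: str) -> bool:
    """True if every letter in the name sits on one QWERTY row."""
    letters = [c for c in name.lower() if c.isalpha()]
    if len(letters) < 5:  # too short to be a meaningful signal
        return False
    return any(all(c in row for c in letters) for row in QWERTY_ROWS)

# Hypothetical weights and threshold, for illustration only.
WEIGHTS = {
    "has_link": 1.0,
    "many_dollar_signs": 1.5,
    "new_account": 2.0,
    "blank_every_other_line": 1.0,
    "repeated_post": 3.0,
    "single_row_username": 1.5,
}
THRESHOLD = 4.0

def spam_signals(text, username, account_age_minutes, recent_identical_posts):
    """Evaluate each fuzzy heuristic on its own; none is decisive alone."""
    lines = text.split("\n")
    return {
        "has_link": bool(re.search(r"https?://", text)),
        "many_dollar_signs": text.count("$") >= 3,
        "new_account": account_age_minutes < 60,
        "blank_every_other_line": len(lines) >= 4
        and all(not line.strip() for line in lines[1::2]),
        "repeated_post": recent_identical_posts >= 5,
        "single_row_username": single_row_username(username),
    }

def spam_score(signals):
    """Sum the weights of the signals that fired."""
    return sum(WEIGHTS[name] for name, fired in signals.items() if fired)

def looks_like_spam(signals):
    return spam_score(signals) >= THRESHOLD
```

Note that `single_row_username("typewriter")` is true, since every letter of that word is on the top row. That is exactly the kind of false positive a single heuristic produces, which is why the sketch only flags a comment once several weighted signals add up.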
ltgenpanda | Nov 11, 2010 | Buried
Really nice feature. But ...
1. Breaking means something time-sensitive. Typically, there would be no more than 10 real breaking news stories a *day* worth highlighting.
2. An algorithm should decide what news is "Breaking" or not. If the algo cannot or does not want to do it, USERS should be able to vote on what is really "breaking".
Without the above, and with what is displayed as breaking at this moment (2:23 PM CST) -- it appears to be a mere "Stories From Our Publishers We Want to Show on the Front Page". Again, I stress that the stories I see now suggest so. Maybe things will change later.
justatool | Nov 11, 2010 | Buried
I really hope so too. I remember the time when I saw the most dugg story on digg... it was 50k!
cduruk | Nov 11, 2010 | Buried
It's not exactly the same, these are hand-picked. However, we have some new features so keep watching this space :)
mastersaiyan | Nov 11, 2010 | Buried
great news..
k so i have a feeling im back at v3 with new design..and less traffic lol :D but going good ! keep it up digg
superman101 | Nov 11, 2010 | Buried
The upcoming list, like almost all story lists, is generated every 10 mins. Otherwise, that query wouldn't return in enough time. (It is a massive sort.) Furthermore, we limit the stories that get into upcoming by a size
The missing stories were a result of our log rotation and analysis. That dependency for upcoming was removed (about a week ago).
We are making the upcoming section more interesting in the coming weeks.
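For anyone wondering why a freshly submitted story doesn't show up right away, the general pattern of rebuilding an expensive sorted list on a fixed interval can be sketched like this. This is an illustration in Python, not our actual code: `SnapshotList`, the `build` callback, and the injectable `clock` are hypothetical names, and the 600-second interval simply mirrors the 10 minutes mentioned above.

```python
import time

class SnapshotList:
    """Serve a cached result and rebuild it only once the interval elapses."""

    def __init__(self, build, ttl_seconds=600, clock=time.monotonic):
        self._build = build        # the expensive query/sort (hypothetical)
        self._ttl = ttl_seconds    # rebuild interval in seconds
        self._clock = clock        # injectable so the behavior can be tested
        self._snapshot = None
        self._built_at = None

    def get(self):
        now = self._clock()
        # Rebuild on first use, or when the snapshot is at least ttl old.
        if self._built_at is None or now - self._built_at >= self._ttl:
            self._snapshot = self._build()
            self._built_at = now
        return self._snapshot
```

Every reader between two rebuilds gets the same snapshot, so the expensive sort runs once per interval rather than once per request. The trade-off is that anything submitted after a rebuild stays invisible until the next one.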
ltgenpanda | Nov 11, 2010 | Buried
Thanks for the swift reply. I see your technical reasoning for the filtering. But do you realize that you are effectively killing any submission made by a user who has no followers? Why should they even submit anything?
A submission from a user with no followers can be seen nowhere except on their profile page. To my understanding, the upcoming section is meant to serve as a place to find stories beyond what our friends have submitted/dugg/commented.
Also, this approach makes it impossible for an API consumer to see the full list of stories submitted in a given time span (e.g., the past 24 hrs) unless they process the stream 24/7. And if an application misses a story in the stream, it has no way to fetch that story later.
confuciussay | Nov 11, 2010 | Buried
Isn't the purpose of Digg for users to decide what is popular & breaking? Human intervention simply defeats the entire purpose. Reminds me of something Kevin Rose said about Slashdot a few years ago. Surely you can determine algorithmically what stories are breaking by the buzz they receive in a short time-frame from diverse sources... similar to V3's trending module.
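Something along these lines is what I have in mind, as a rough Python sketch. I don't know how V3's trending module worked, so this is only my guess at the general idea: the event tuple layout, the `source` label (say, a referrer or user group), and every threshold here are made-up assumptions.

```python
from collections import defaultdict

def breaking_candidates(events, now, window_seconds=600,
                        min_diggs=3, min_sources=3):
    """Return story ids with enough recent diggs from enough distinct sources.

    events: iterable of (story_id, source, timestamp) tuples (hypothetical).
    """
    counts = defaultdict(int)
    sources = defaultdict(set)
    for story_id, source, ts in events:
        # Only activity inside the recent window counts as "buzz".
        if now - window_seconds <= ts <= now:
            counts[story_id] += 1
            sources[story_id].add(source)
    hits = [s for s in counts
            if counts[s] >= min_diggs and len(sources[s]) >= min_sources]
    # Busiest stories first; ties broken by id for a stable order.
    return sorted(hits, key=lambda s: (-counts[s], s))
```

Requiring several distinct sources is the part that matters: a story dugg three times from one place would not qualify, while three diggs from three places would.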
cduruk | Nov 11, 2010 | Buried
Thanks for the feedback. The module is going to be used not only for picking breaking news but also for collecting stories about a certain topic in a central place; that's what drove the development of this. I can say as the developer of this feature that those are what drove the decisions.
kwcarpenter | Nov 11, 2010 | Buried
That makes me miss Heroes. Back when it, also, was good.
richid | Nov 12, 2010 | Buried
An interesting suggestion. However, our goal is to avoid having users deal with this in the first place. We have some work in progress that I think is going to dramatically help the situation. If it does not, crowdsourcing the stuff that slips through could be a viable option.