Sunday, December 26, 2010

The Puddybud Series: Who Started the Name-Calling?

Short answer: Puddybud did.


First, a quote from old #2 himself:

Puddy was the recipient of name calling WAY BEFORE Puddy called people names.
Yes, he's said this many times: we liberal-leaning folk started name-calling him, and he was provoked into being such an OCD name-calling troll.

Sorry, Puddybud. This is not true. Let's look at your very first comment at HA (as Puddybud, anyway), way back in April 2005:

Hey Goldy: You never answered my question I posted to you on the wonderful SoundPolitics blog. Butt as you are, no guts to answer.
Notice the emphasis. Puddybud continued this "butt" fetish and it didn't go unnoticed by the faithful. Motivated by the obnoxious commentary of this newly minted troll, some started calling him "Puddy-butt". Why spell the conjunction "but" in this fashion? Well, Puddybud eventually answered:

The Butt is my reference to you all, especially Armageddon. He/she/it has many friends down here in the San Francisco area. In fact isn’t Armageddon the San Francisco treat?
Well, there it is. He's calling everyone who doesn't care for his views a "butt". However(!), return to his first comment - "Butt as you are" - this can also be read as calling Goldy a "butt".

Of course Puddybud might claim that Goldy was the original name caller, calling Tim Eyman a horse's ass (a statement of fact in my opinion), or he might restate that he commented on HA earlier under another name (for which he's never offered a link as proof and likely never will) and was name-called for his trouble.

In any case, the evidence remains quite strong that Puddybud started the name-calling right out of the gate and it's little wonder that people have reacted to him negatively ever since.

Your move Puddybud.

Update: Puddybud denies calling Goldy a "butt" and still claims victim status.

Tuesday, August 10, 2010

Tuesday, August 3, 2010

DB Snapshot Demo

Enjoy.. Blow it up to full screen to see it better.

Update: small bug found in the snapshot code. As of my last run, the db has 238 comment records over 8 articles in August. Before it was just importing the same 168 records.

Thursday, July 29, 2010

The Junkshot Series: Let's talk about Reckless

This post begins a series on HA's most obsessive and unrepentant troll: junkshot, my name for the troll you all know as Puddybud. We'll start at the margins and move in deeper. This will be an occasional series touching on stuff I find mostly laugh-out-loud hilarious; otherwise, I'm sure you'll all agree, junkshot is a pretty tedious character. Focusing on individual trolls is somewhat of a distraction from the original goal of illustrating collective troll behavior through data mining. But junkshot is the second most prolific commenter in the HA comment threads after Roger Rabbit and easily the most prolific troll, so it'd be a disservice not to focus on him a bit.

So let's begin. Who or what is "Reckless"? Here's a noteworthy exchange between myself and junkshot:

302. YLB spews:

300 – LMAO @ you moron. You got nothing on me with names. MWS? Kingbud? Reckless? There were others for sure.

305. Puddybud, Hey it's the new year... spews:

MWS? Kingbud? Yep. Prove I was Reckless. Go on fool! 
Very interesting. He readily admits to being MWS (aka Mike Webb Sucks, kind of hard not to do that). And he readily admits to being Kingbud (I don't recall if he was ever outed by Goldy or Darryl or anyone else on that one) but.. he balks at Reckless. Why? But I digress. Let's first oblige him and prove it. Here:

37. Puddybud spews:

Moonbat!s I created Reckless to prove no matter how you speak to a Moonbat!, nasty or nice, they are nasty.

Tried being nice with Reckless.

John: I tried good discussion. That was my Reckless moniker.
BTW John, when I posted nicely without swearing as Reckless, without antagonism, without innuendo, just posting facts, it fell on deaf ears.
Of course we'd be remiss to leave out Goldy's outing:

13. Goldy spews:

Reckless/Mike Webb Sucks/Puddy/etc. @4,
Perhaps if we added together all the comments coming from your various aliases, you might turn out to be one of HA’s heaviest users?
Heaviest users? Indeed. He's the #2 heaviest user. But enough ok! Puddybud (junkshot) is/was Reckless..  Why do I have to "prove" it? Why couldn't he just admit he's Reckless along with MWS and Kingbud? Let's hold that thought for a moment. Let's look at Reckless' legacy at HA.

First of all, junkshot has this to say about Reckless:
Moonbat!s I created Reckless to prove no matter how you speak to a Moonbat!, nasty or nice, they are nasty. Reckless called everyone by their real name. Reckless was polite.
Reckless asked legitimate questions.
Reckless never challenged Goldy’s scratchy voice.
Reckless never challenged Goldy’s Koro’s disease.
Reckless never called YLB - Clueless.
Reckless never challenged FroggyASS.
Reckless never swore at Carl Grossman.

Well, well, how "nice" of Reckless. Let's see how much of that resembles the truth shall we? Reckless produced 111 comments between 1/13/2007 and 2/1/2007.

Reckless was polite? Here's a snip from his FIRST comment in response to concern troll Thomas Trainwinder:

Hypocrisy. I thought that was the Democratic mantra.

Nice dinner party guest he'd make. Polite? Questionable to say the least. Obnoxious? Right out of the gate.

Reckless never called YLB - Clueless. Hmmm.

Someone once blogged you wrte like Clueless. I’ll go one better, YOU ARE CLUELESS

Someone huh? No. I guess Reckless never "called" yours truly that. He merely spelled it out in caps. Classic right wing dodge.

Reckless never challenged Goldy’s scratchy voice. Oh?

I love pulling on Voice of Scratching Chalk’s chain.

No he pulled on his chain.. Let's see what else?

Reckless never swore at Carl Grossman.

You on the prowl Carl? I never heard of adults on myspace before unless they are predators!

Interesting.. How nice of Reckless not to swear. He merely "challenged" Carl with innuendo about predatory sexual behavior. (Didn't junkshot say no innuendo from Reckless?)

And here's Reckless trying to be "nice":

Roger: Post #2 isn’t funny? It’s not even hyperbole. Try not to politicize everything? It makes you look dumb. On second thought, you work hard on that all by your lonesome!

later same comment..

I realize you are a rabble rouser, but many times your tenuous use of words is silly, Rabbit! I’ll give your one: Silly Rabbit must be getting tired. Why doesn’t the Silly Rabbit go to sleep for a year!


Richard@44. I like it. Since Moonbat!s inhabit King County, PERFECT! Again I ask why won’t King County Moonbat!s demonstrate your support for Ron Sims, sell your cars, give to the poor, and ride those bicycles to work?

Nice indeed!

And we'd be remiss to overlook that Reckless felt a special affinity with you-know-who!

Puddy: Clueless is YLB?
Tell me more!

Being deceptive. Very nice. Did Kingbud circle-jerk with his master? I'll have to check one of these days.

So why did junkshot ask me long ago to "prove" that he was Reckless? Guess we'll never know for sure. You can ask till you're blue in the face and junkshot won't tell. If anyone has any special insight, feel free to leave it in comments.

Reckless indeed was a notable tactic of junkshot/Puddybud's "top-kill" strategy. Pretty much forgotten but now immortalized in this blog post. Again - someone's got to do it.

Next up in the series: Who called who what names?

Monday, July 26, 2010

Yet More Tagging Progress!

The 319 handles in the "between 100 and 999 comments" bracket have been tagged. Learned many interesting things.

About a third of the handles were pretty obvious names that were familiar to me. They took about as much time to tag as the over 1000 group.

However the remainder were kind of forgotten to me and I had to flip between my tag file and a little query script to scan their comments to answer the question - troll or non-troll?

It was a longer slog than I cared for but... It was a pretty interesting bunch of comments.

Some comments came from single-issue type people who impressed me with the depth of their knowledge on their pet issues - everything from the ferries to Sound Transit to whatever was the issue of the day at HA.

Some were from lefties that had little regard for the Dems and predicted (fairly accurately) that people on the left were getting their hopes up way too high for the Dems to deliver any appreciable change.

Some comments came from righties (a few, mind you) that believed more or less the same thing and were pretty disgusted with how the Republicans had screwed up things. I could not in good conscience call these righties trolls.

There was even one right winger who impressed me as being genuinely interested in putting forth an intellectually sound and nuanced argument. One. Just one! This winger was sounding out a lefty for his personal opposition to abortion. I made a note to myself to read the entirety of this right winger's comments.

See trolls? You can get some positive attention from us if you quit the name calling, turn off the right wing hate radio and other degenerate propaganda and think deeply through your positions on the issues.

All in all it was somewhat tedious but rewarding work and at this point I'm 84.5 percent of the way through tagging the entire comment database.

The next bracket, the "between 10 and 99 comments" bracket has a touch over 1400 handles in it. Already started with the ones that are most familiar. We'll see how it goes. Again, after that, I'll have the comments 94 percent tagged.

Beyond that? Well "the swamp" (almost 14,000 handles!) has some interesting critters in it indeed. I won't ignore it. Trolls, be advised: you can run (or swim) through "the swamp", but you can't hide.

Lastly there's the job of sifting through handles that have switched between troll and non-troll identities - separating the troll comments from the non-troll comments, and even from the spam, wrapped under one handle. I started sketching out a user interface for a web app that will help with that and with the whole tagging, typing chore to boot.

I've pretty much settled on the Sinatra framework for the web app. Gonna be a whole lotta fun!

Monday, July 12, 2010

Tagging Progress

Over the weekend I took the most comment-prolific 81 handles, grouped the contained aliases under a single tag (e.g. all of Puddybud's various handles under the tag "junkshot") and categorized the tag as troll or non-troll.

Remember that those top 81 handles have contributed 1,000 or more comments each and in total account for over 62 percent of all comments.

Conclusion: 30 percent of those comments are troll comments. That number will probably hold pretty steady as I tag the next 1700 or so most comment-prolific handles. And after I'm done with that?

Just a hair over 94 percent of all comments will be judged troll or non-troll.
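For the curious, here's roughly how a number like that 30 percent can be computed once aliases are rolled up under tags. This is a minimal sketch and not the project's actual code: the `GROUPED_ALIASES` and `TAG_KINDS` tables, the `example_hero` tag and any counts fed to it are made up for illustration. Only the idea of grouping aliases under one tag and calling the tag troll or non-troll comes from the post.

```ruby
# Hypothetical alias table: each handle maps to the single tag it is grouped under.
GROUPED_ALIASES = {
  "Puddybud"     => "junkshot",
  "Kingbud"      => "junkshot",
  "Reckless"     => "junkshot",
  "Example Hero" => "example_hero"
}.freeze

# Hypothetical troll / non-troll call for each tag.
TAG_KINDS = { "junkshot" => :troll, "example_hero" => :non_troll }.freeze

# Roll per-handle comment counts up to per-tag counts.
def comments_by_tag(comments_by_handle, aliases)
  comments_by_handle.each_with_object(Hash.new(0)) do |(handle, count), totals|
    totals[aliases.fetch(handle)] += count
  end
end

# Percentage of all comments that sit under tags categorized as troll.
def troll_percent(comments_by_handle, aliases, kinds)
  by_tag = comments_by_tag(comments_by_handle, aliases)
  troll  = by_tag.sum { |tag, count| kinds.fetch(tag) == :troll ? count : 0 }
  (100.0 * troll / by_tag.values.sum).round(1)
end
```

Feed `troll_percent` a hash of handle => comment count and it returns the troll share as a percentage. `fetch` is used on purpose so an untagged handle blows up loudly instead of being silently skipped.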

Of course the work is never done. Some handles like "John" or "Bill" or "Steve" have exhibited either troll or non-troll character throughout their lifetimes. The troll comments contained under those handles will have to be laboriously sifted out and assigned their own special tags.

Wednesday, June 30, 2010

One for the Ages

“If your faith conflicts with reality, ignore reality.”

A distilled teabagger and HA troll maxim.

This blog cannot be about just the trolls. It should also from time to time salute the heroes.

Thanks to relative newcomer, Deathfrogg.

Thursday, June 17, 2010


Been a while since the last post. Still tagging handles, still removing warts from my code. The latest code is almost where I want it. I could do a bit more factoring to remove some repetition, but it would be more for aesthetic purposes. The code does what I want with satisfactory performance. Test coverage, as is usually the case, could be much better.

On a more relevant note, April and May 2010 are now in the database so I thought I'd update the big picture that I first sketched here.

As of the end of May 2010:

8,082 articles over the 73 months of HA's existence.

430,912 comments. (Haven't filtered out the spam yet.)

and last but not least:

Out of 15,728 unique handles!

Top 10 # of comments by handle:

10 GBS @ 5,317
9 rhp6033 @ 5,478
8 ArtFart @ 6,004
7 Steve @ 6,167
6 Marvin Stamn @ 6,774
5 Daddy Love @ 8,580
4 YLB @ 9,323
3 Mr. Cynical @ 9,827
2 Puddybud @ 11,369

and number one?

Our beloved Roger Rabbit at 58,310.

So Steve moves ahead of Art, gaining on Stamn. Puddybud barely budges due to begging everybody else to come to me to justify his drivel. (Useless, he'll ALWAYS be #2.) All the usual disclaimers on that top 10 list still apply.

An updated brackets report follows:

One handle has joined the 1000+ comments club. Right now I don't know who you are but congrats!

Update to "Update": The newest member of the 1000+ comments club is "John" which is a handle that's been used by many, many people, hero and troll alike, over the years. Again, congratulations "John"!

Friday, April 30, 2010

Entering the Twitterverse

Tagging handles in earnest indeed. In the meantime, I may tweet a few eureka moments about code or trolls.

While I was at it, I added an rss feed.


Monday, April 26, 2010

Still Distracted but Turning a Big Corner

In the last post I was all jazzed about an opportunity to make the data collection from my "select sources" run much faster.

Well, I'm pretty much at the other end of it and I'm quite happy about it. I did a major rework of the code. It's much better than before and my only regret is that I wish I'd thought of this speedup sooner.

The old code base accomplished the collecting (almost 8,000 articles and 415,000 comments) but it did the job kind of slowly and that was even with some good speedups I had spotted early. But if I had had what I've got now, it would have gone much faster. Also I would have totally minimized data collection from HA itself. Oh well. Better late than never.

In the last post I mentioned that I used code that would automate a web browser to navigate the web and collect HA articles from select sources. I also mentioned that by bringing up one browser instance and feeding requests to it, you get much better data collection performance than by opening a browser for each request.

The obvious speedup of course is to distribute requests across more browser instances - make it scale. But now we're talking about forking off objects from classes that encapsulate the browser-driving code. And once you create sub-processes you need a means to communicate requests to and retrieve results from those forked instances.

The tried and true method to accomplish this communication is to use a message queuing daemon. In a previous Ruby project I'd used a pretty slick one with a handy Ruby support library. No problem mon -  it's just a matter of finding the right places to insert this code. It was a lot of work but it made me take a hard look at the entire code base and throw away a lot of stuff that just wouldn't be needed anymore. And as I separated out code into Ruby modules by function, I was forced to come to a better understanding of instance variables. In my Repo class I totally overused class methods and was passing way too many parameters from method to method. Why? Way back in the beginning I thought it would be easier to test if I did it that way. Totally wrong. Thanks to this rework, the code in this project and my other projects will be so much better.
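To give a flavor of the request/result plumbing, here is a minimal sketch under some loud assumptions: it uses plain Ruby threads and the built-in `Queue` in a single process, which is a swapped-in stand-in for the forked browser-driving instances and the message queuing daemon described above. The method name `collect_in_parallel` and the `fetch` block are hypothetical.

```ruby
# Assumption: threads plus an in-process Queue stand in for forked
# browser-driving processes talking through a message queuing daemon.
def collect_in_parallel(article_numbers, worker_count: 3, &fetch)
  requests = Queue.new
  results  = Queue.new

  article_numbers.each { |number| requests << number }
  # One stop marker per worker so every worker loop terminates.
  worker_count.times { requests << :done }

  workers = Array.new(worker_count) do
    Thread.new do
      while (number = requests.pop) != :done
        results << [number, fetch.call(number)]
      end
    end
  end
  workers.each(&:join)

  # Drain the results queue into a hash of article number => article body.
  collected = {}
  collected.store(*results.pop) until results.empty?
  collected
end
```

Something like `collect_in_parallel(1..5) { |n| "article #{n}" }` returns a hash keyed by article number. The shape is the same as with a real queue daemon: requests go in one end, workers pull them off, results come back on a second queue.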

One of my select sources in the past really made me tear my hair out. I had only collected 4 or so articles from it because of all the trouble it gave me but now that's all changed. I've gotten much more than a handful from that source for March 2010 alone and I expect it to only get better.  And there's yet another select source out there that I didn't even bother to try because it was so weird and different from the others. I'm feeling pretty confident now. I will give it a shot but I have to do a couple things first.

I'm going to start tagging handles in earnest. This is really critical to separating the trolls from the heroes and then being able to make big picture analyses of troll behavior over the months and the election cycles.

Then I've got to scratch an itch about current month activity. So far I'm caught up to the end of March 2010. Normally I wouldn't add a month to my collection until 10 days into the next month. Why? Goldy doesn't shut down comments to an article until 8 days or so after the article is published. So if I'm curious about some interesting activity in the current month, I have to develop a way to create a snapshot database with an eye on merging various snapshots into the real thing to save time when a month finally closes.
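One way the snapshot merging could look, as a minimal sketch only: it assumes each comment record carries a unique `:id` (an assumption, the real schema isn't described here) and that a later snapshot should win when the same comment shows up twice. The method name `merge_snapshots` is hypothetical.

```ruby
# Assumption: every comment is a hash with a unique :id key.
# Snapshots are passed oldest first; later snapshots overwrite earlier ones.
def merge_snapshots(*snapshots)
  merged = {}
  snapshots.each do |snapshot|
    snapshot.each { |comment| merged[comment[:id]] = comment }
  end
  merged.values
end
```

Keying on the comment id means re-importing an overlapping snapshot never duplicates records, and a still-open thread just picks up its new comments on the next pass.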

Wednesday, April 7, 2010

Opportunities, Distractions

One goal of this project was to minimize, as much as possible, data collection directly from HA itself. I would have been a lot farther along much sooner if I didn't care about this. Why is this important?

I not only want to slice and dice troll behavior but I also want to learn all these awesome cool ruby-based software libraries. They've been a total joy to learn about and work with.

The data collection problem and the composition of its solution into classes has been pretty simple so far. There's a Session class whose job is to collect one month of HA's existence. The most important instance method of this class is add, which at its heart is a for loop that starts at the first article of the month and ends at the last. Each article of course contains the comments, which are the gold I'm after. But it's within the loop that the fun really begins.

I said I wanted to minimize data collection from the well. To do this I constructed a least cost route of sources from which to draw articles, with HA being the highest cost, the last resort. So I have a Repo class (short for repository) that's blessed from a Session instance. I feed it an article number and a source and it tries to get the article. If it comes up short then the Session instance tries the next source in the route.
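The Session/Repo arrangement can be sketched like this. It's a bare-bones illustration, not the real code: the class names come from the post, but the constructor arguments, the `fetch` method, the `articles` reader and the idea of stubbing each source as a callable that returns an article body or nil are all assumptions.

```ruby
# Repo: given an article number and a source, try to get the article.
# Assumption: each source is a callable returning the article body or nil.
class Repo
  def initialize(fetchers)
    @fetchers = fetchers # { source_name => callable }
  end

  def fetch(article_number, source)
    @fetchers.fetch(source).call(article_number)
  end
end

# Session: collects a run of articles, walking the least cost route per article.
class Session
  attr_reader :articles

  def initialize(route, repo)
    @route    = route # source names, cheapest first, last resort last
    @repo     = repo
    @articles = {}
  end

  # Loop from the first article to the last; for each one, try the sources
  # in route order and stop at the first that comes back with something.
  def add(first_article, last_article)
    (first_article..last_article).each do |number|
      @route.each do |source|
        body = @repo.fetch(number, source)
        next if body.nil?

        @articles[number] = { source: source, body: body }
        break
      end
    end
    self
  end
end
```

With a route like `[:mirror, :ha]`, an article the mirror can't supply falls through to `:ha`, and an article no source can supply is simply left out of `articles` so it can be retried later.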

So far pretty simple. Not rocket science. Fun? Only for a geek, right? Well it gets even better. Some of my sources require calling a simple http get which is available from lots of Ruby http libraries. This is how I get stuff, again as a last resort, from HA. Butt simple and too boring. But a few sources, I call them select sources, require something a little more sophisticated. They require driving a web browser to go and get the article. Without getting into too much detail:

Making this go was just too much freaking fun for words.

So, after fist pumping and watching my code drive a web browser to navigate the web and get the stuff I wanted from it I quickly realized I'd done something a little dumb:

My code was opening and closing the web browser for each request to the web.. Ugh.. It's a lot faster Sherlock to just leave one browser instance up and feed all the many requests you want to it and then close it when the session is concluded. Another hour or two of searching for the right spot in the code to insert the needed changes and I had a big increase in performance.
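The fix boils down to something like the following sketch. The browser-driving library isn't named here, so `FakeBrowser` is a made-up stand-in that only counts how many times it gets launched; `with_browser` and `fetch_all` are hypothetical names too.

```ruby
# Made-up stand-in for a real browser driver; it just counts launches.
class FakeBrowser
  class << self
    attr_accessor :launches
  end
  self.launches = 0

  def initialize
    self.class.launches += 1
  end

  def get(url)
    "<html>#{url}</html>"
  end

  def close; end
end

# Bring up one browser, hand it to the block, and always close it afterwards.
def with_browser(browser_class = FakeBrowser)
  browser = browser_class.new
  yield browser
ensure
  browser&.close
end

# Feed every request to the same browser instance.
def fetch_all(urls)
  with_browser { |browser| urls.map { |url| browser.get(url) } }
end
```

However many urls go into `fetch_all`, the browser comes up exactly once, and the `ensure` makes sure it gets closed even if one of the requests raises.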

Which is all a long winded introduction to the point of this post - there's yet ANOTHER opportunity (and yep, distraction from my initial goal) to make this puppy run even faster. Next post. I'm gonna code.

Tuesday, April 6, 2010

QA winds down. HA's most commented posts.

To kind of, sort of check on the quality of the data collection process, I whipped up a report of the most commented articles of each month of HA's existence. Then here and there I went to the well to see if what I have and what HA has matches up..

I've found tiny variances in, strangely enough, both directions.. I'll definitely have to look more thoroughly at those months where I seem to have MORE than what Goldy has. But so far the variances seem to be made mostly of spam.

Once I had that report working, I tweaked it to select one article at random from each month. Again, things seem to match up just fine. Even better than the most commented articles sample.

So confidence is high.
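Both spot-check reports are simple to sketch. The following is an illustration under assumptions, not the actual report code: it supposes each article is a hash with `:month`, `:article` and `:comments` keys, and the method names are hypothetical.

```ruby
# Assumption: articles look like { month: "2005-03", article: 123, comments: 45 }.

# The most commented article for each month.
def most_commented_by_month(articles)
  articles.group_by { |a| a[:month] }
          .transform_values { |rows| rows.max_by { |a| a[:comments] } }
end

# One article picked at random from each month.
def random_article_by_month(articles, random: Random.new)
  articles.group_by { |a| a[:month] }
          .transform_values { |rows| rows.sample(random: random) }
end
```

Passing a seeded `Random` into `random_article_by_month` makes a sample repeatable, which helps when going back to the well to compare the same articles a second time.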

Anyhow the most commented posts merit further mention. Here's a pretty picture:

The first column is the year/month, the second is the article number and the third is the total number of comments for the thread. The most commented HA blog post, in March of 2005, was a knock-down, drag-out thread about Terri Schiavo - a major body-blow for the extreme right wing. Between trying to weaken Social Security, using Terri Schiavo's vegetative body as a prop, letting New Orleans drown, corruption/sex scandals, torture, dollar black hole wars of choice and the final, ugly meltdown of the economy - let us never forget what it means to have the right wing in control of this country.

Monday, April 5, 2010


One half of one percent of commenter handles, 80, belong to an elite group. They are handles each associated with over 1000 comments posted throughout the lifetime of HA. I posted the top 10 a while back.

Browsing the list of 80.. All the names are quite familiar to me, troll and hero alike, and as the picture above shows, they account for over 62 percent of the comments posted. They represent HA's hard-core community of political junkies and hangers-on and they include the most dedicated and unrepentant of right wing trolls.

A somewhat broader community of 306 handles have posted somewhere between 100 and 999 comments each but they only account for a little over 22 percent of all comments. Quite a few of them may one day soon pass the 1000 mark. Many of them enjoy participating in the HA brouhaha but are either fairly new to it, have dropped away or HA is far from a priority for them.

A modestly more numerous bunch have posted between 10 and 99 comments. I speculate that many of them are again new to the community or they may have started, felt lost in the crowd (or disgusted) and then quickly dropped away.

Lastly there's a group that's very interesting to me. They account for over 88 percent of the 15,078 handles recorded and yet each handle accounts for fewer than 10 comments over the life of the blog. Quite a few of them, if not most, are surely just spammers. Many others are just hit and run commenters, responding to a provocative blog post that's gotten some traction in the wider blogosphere and elsewhere.

But this group also contains some of the most vicious, nasty and vindictive trolls of all.

This group includes the HNMT and members of The Hit Squad.

Late thought: the big lesson to draw is that if I concentrate on 1,748 handles and tag them, then I've in turn categorized over 94 percent of the comments. Pretty useful little report. And I just picked the brackets on a hunch.
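The brackets report itself is not much code. Here's a minimal sketch, assuming the input is a hash of handle => comment count; the `BRACKETS` constant, the `brackets_report` name and the output keys are made up for illustration, while the bracket boundaries are the ones used in this post.

```ruby
# Bracket boundaries as used in the post; labels are illustrative.
BRACKETS = {
  "1000+"    => 1000..Float::INFINITY,
  "100-999"  => 100..999,
  "10-99"    => 10..99,
  "under 10" => 0..9
}.freeze

# For each bracket: how many handles fall in it and what share of all comments they wrote.
def brackets_report(comments_by_handle)
  total = comments_by_handle.values.sum.to_f
  BRACKETS.map do |label, range|
    members  = comments_by_handle.select { |_handle, count| range.cover?(count) }
    comments = members.values.sum
    {
      bracket: label,
      handles: members.size,
      percent_of_comments: (100 * comments / total).round(1)
    }
  end
end
```

Adding up the `handles` and `percent_of_comments` columns for the top three brackets is all it takes to arrive at the kind of "tag this many handles, cover this much of the comments" figure.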

Bit of an update: The 88 percent group of handles mentioned above also includes a lot of people making fun of the trolls by doing a variation on their handle. Goldy calls his comment threads "the cesspool". Sue me, I'm a part of it. But this 88 percent group of handles which accounts for less than 6 percent of all the comments I call "the swamp". There's some fun to be had there but only as long as you don't wade in for too long.

Saturday, April 3, 2010

Tagging Handles

Trolls and Heroes adopt many aliases over the course of their on-line existence. To separate them from one another and make searching and analysis easier I'm going to stamp each comment with a tag. No not all umpteen thousands of them - just associate the 15,078 handles (and growing) with a more coherent tag e.g. puddybud for all the versions of puddybud's handle and that will in turn stamp his legacy of over 26 thousand comments.

The tags will live in a separate file, a hash whose key is the tag and whose value is a category:

h - hero
t - troll
ul - a lefty who hurts the cause by getting too much like a righty.
rr - a right winger whose tone is reasonable most of the time. (Very rare at HA.)
s - spammer
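As a rough sketch, the tag file could be as plain as this. The category codes are the ones listed above; the sample tags other than puddybud, the constant names and the two helper methods are hypothetical and only there to show the shape.

```ruby
# Category codes from the list above.
CATEGORY_CODES = {
  "h"  => "hero",
  "t"  => "troll",
  "ul" => "lefty who hurts the cause",
  "rr" => "reasonable right winger",
  "s"  => "spammer"
}.freeze

# Hypothetical tag file contents: tag => category code.
TAGS = {
  "puddybud"        => "t",
  "example_hero"    => "h",
  "example_spammer" => "s"
}.freeze

# Tags whose category code is not one of the known codes (typos and the like).
def invalid_tags(tags)
  tags.reject { |_tag, code| CATEGORY_CODES.key?(code) }.keys
end

# Human readable category for a tag.
def describe(tag, tags = TAGS)
  code = tags[tag]
  return "#{tag}: untagged" if code.nil?

  "#{tag}: #{CATEGORY_CODES.fetch(code)}"
end
```

Keeping the codes short keeps the file easy to edit by hand, and a quick `invalid_tags` pass catches a fat-fingered category before it pollutes a report.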

In other news I'm torn a bit about using Sinatra over Rails for the QA portion of this project. Rails 3 is worth a look because it's new and much more flexible. I'll probably use both for portions of the remainder of this project.

After all most of the motivation and benefit of this project has been from learning and applying all these wonderful ruby-based software development frameworks.

Late update: Do you believe in synchronicity? I just found this out minutes before I published this post. The author of the command/tasking framework that I use to develop my reports on this project is heavily motivated by the concept of machine tagging which is used by Flickr and other projects. For now I'm sticking with the simple tagging implementation I've sketched here but I may have to revisit this more thoroughly later.

Thursday, April 1, 2010

MTR's blog

When bet-welshing troll Mark the Redneck arrived on the scene, he bragged he had a blog. I remembered him crowing about it and I had even read it once or twice but I forgot the url and could never find it again through google searches..

Well here comes our database yet again to the rescue. The blog probably went quiet shortly after MTR's boast but its ghost haunts our dim memory through the wayback machine:

For the most die hard troll aficionados only..

Tuesday, March 30, 2010

An Idea for a Rails QA App forms...

I've collected all the data up to the end of February 2010 and had some fun running a few reports. Now what?

There's the QA process. Do I really have all the data or do I just think I do? How do I exclude what I don't want, i.e. spam?

How do I separate the heroes from the trolls from the spammers? That's the most important question because this project is about troll behavior and having fun dissecting it.

The work of collection was a month by month process. The frontpage contains a comprehensive list of the archives by month and the number of articles for each month. So I'm thinking right now that a good QA process' work will have to be divided the same way.

So first I see a selection toolbar at the top of the browser viewing area with a year/month and backwards/forwards arrows. At the bottom center right of the screen a radio button group with hero, troll, spammer selections and a submit button. The rest of the screen will contain a sampling of comments from the handle owner.

So at a glance, I can come to a decision, is this a hero, a troll or a spammer? Once the decision is made the comments and the handle can be marked and then excluded from further year/month examinations.

And of course there's all the handles whose status I already know up front. No need to look at those. Roger Rabbit's stuff alone accounts for 11 percent!

So that's good enough for pass 1.

Pass 2 is then first a matter of submitting some spammer's spam for further examination to Akismet. Why? A hero or a troll's handle may have been hijacked by a spammer for a time. Submitting the comments to Akismet can easily mark what is spam and what is not.

Next is the hardest part.. A handle like Jane Doe may at one time be a progressive from Tacoma and at another time a troll from Bumfuck, WA. That's going to take a good amount of work but of course if we're smart, the work could be minimized.

15,078 handles.. Don't remind me of the comments. Someone's got to do it.

Monday, March 29, 2010

The HNMT revealed...

Here we see a screenshot of a pivot table I cooked up to summarize comments by handle in 2009. I looked for handles that had either "Roger" or "Rabbit" in them, i.e. Roger|Rabbit, not being confident that the bash command line interface would tolerate the space between the first and last names in one shot..
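A pivot like that can be sketched in a few lines. This is an illustration only, assuming each comment is a hash with `:handle` and `:month` keys; the method name is hypothetical. For what it's worth, wrapping the pattern in single quotes on the bash command line should normally let a space through untouched.

```ruby
# Assumption: comments look like { handle: "Roger Rabbit", month: "2009-01" }.
# Returns { handle => { month => comment count } } for handles matching the pattern.
def pivot_by_handle(comments, pattern)
  table = Hash.new { |hash, handle| hash[handle] = Hash.new(0) }
  comments.each do |comment|
    next unless comment[:handle] =~ pattern

    table[comment[:handle]][comment[:month]] += 1
  end
  table
end
```

Calling it with `/Roger|Rabbit/` gives one row per matching handle and one column per month, which is all a pivot table really is.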

Well lookee there in the leftmost column.. (Click on the picture to super-size it.) A gaggle of putrid handles by that most resentment-driven and odious of trolls - the hateful name morphing troll, the HNMT.

In contrast, take a look at Roger Rabbit's line. He has misnamed himself. He's not a rabbit, he's a tortoise and the numbers peppered around the rabbit's line? Not a rabbit either, but to be sure a most bothersome pest.

I noticed something else.. In 2008..

Again we see Roger's solid row, this time in the bottom third of the report, and again we see the obnoxious pest buzzing about.. Activity seems to build to a frantic pitch indeed in May of that year and then... There's a very curious shutdown in pest vigor. What happened? Illness? Personal crisis? Boredom? Resignation at soon to be President-Elect Obama's sweeping victory in November? Or perhaps some other unhealthy fixation?

Another mystery to solve..

It is accomplished...

As of the end of February 2010.

7,680 articles over the 70 months HA had been in existence to that point.

414,965 comments. (Haven't filtered out the spam yet.)

Whew! A mother lode of right wing inanity and foolishness. With some intelligent and fun liberal-leaning comment thrown in..

and last but not least:

Out of 15,078 unique handle names!

Top 10 # of comments by handle:

10 GBS @ 4,952
9 rhp6033 @ 5,016
8 Steve @ 5,582
7 ArtFart @ 5,839
6 Marvin Stamn @ 6,774
5 Daddy Love @ 8,246
4 YLB @ 8,678
3 Mr. Cynical @ 9,314
2 Puddybud @ 11,368

and number one?

Our beloved Roger Rabbit at 56,751.

Commenters' various aliases are not yet factored in. Steve might be several different "Steves". (And the HNMT is a particularly nasty case.) It's going to require quite a bit of study. And I first have to do some QA on the collection process as a whole. So this post will be updated as the picture gets clearer..

But so far, so good! Fun, fun, fun on the runup to November!

Special note to that most moronic of trolls (#2? It fits!):

$ du -h .ha

201M    .ha/data