Analyzing ~425 days of Hacker News posts with standard shell commands

(About) 425 days ago (at the time of this writing) I started scraping Hacker News via its shiny new API. And then I promptly forgot about it. That is, until I noticed my cronjob had been throwing errors constantly for a few weeks:

Traceback (most recent call last):
  File "/home/dummy/projects/hn-cron/hn.py", line 62, in <module>
    main()
  File "/home/dummy/projects/hn-cron/hn.py", line 53, in main
    log_line = str(details['id']) + "\t" + details['title'] + "\t" + details['url'] + "\t" + str(details['score']) + "\n"
KeyError: 'url'

Instead of fixing anything, I just commented out the cronjob. But now I feel somewhat obligated to do at least a rudimentary analysis of this data. In keeping with my extreme negligence/laziness throughout this project, I hacked together a few bash commands to do just that.

A few notes about this data, and the (in)accuracy thereof:

  1. The script ran once every 40 minutes, collecting the 30 most popular stories (i.e. those on the front page) and adding them to the list if they were new.
  2. I only know I started roughly 425 days ago because the first link in log.txt was this one right here (Who needs timestamps? I have IDs!)
  3. A not-insignificant percentage (probably ~10%) of the time, the script would fail because the stupid(, stupid, stupid) Python 2 script I banged out in 10 minutes didn’t know how to handle Unicode characters properly (oops).
  4. I saved everything to a tab-delimited flat file. I probably should’ve used something else, but I didn’t, so here we are.
  5. I only saved the score from the first time a story was found, so theoretically any given post had at most an arbitrary 40-minute window to accumulate points. This is probably not strictly true for a number of reasons, but I’m going to pretend it is.
  6. These bash commands grew organically (often with much help from Stack Overflow), so they made sense to me at the time, but YMMV.
  7. The data is probably inaccurate in a million small ways, but overall, it’s at least worth poking at.
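
To make the file layout in the notes above concrete, here is a minimal sketch of the kind of one-liners that work on a file like this. It is not one of the commands from the post itself. It assumes the column order from the log_line in the traceback (id, title, url, score, separated by tabs), and the function names top_domains and mean_score are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a tab-delimited log with the assumed columns:
#   id <TAB> title <TAB> url <TAB> score

# Print the N most common hostnames among the logged URLs as "count<TAB>host".
# Splitting a URL on "/" leaves the hostname in the third piece
# ("http:", "", "example.com", ...); rows without a usable URL are skipped.
top_domains() {
    local file=$1 n=${2:-10}
    cut -f3 "$file" |
        awk -F/ '$3 != "" { print tolower($3) }' |
        sort | uniq -c | sort -k1,1nr -k2 | head -n "$n" |
        awk '{ print $1 "\t" $2 }'
}

# Print the mean of the score column, ignoring rows whose score is not a number.
mean_score() {
    awk -F'\t' '$4 ~ /^[0-9]+$/ { sum += $4; n++ }
                END { if (n) printf "%.1f\n", sum / n }' "$1"
}
```

With the log in the current directory, something like `top_domains log.txt 20` or `mean_score log.txt` would be the usage. Given note 5, the mean is only an average of first-sighting scores, not final ones.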

Okay, let’s get down to it!


dot-man

I recently hacked together a little 300-line bash script called dot-man to manage my dotfiles. Basically, it lets you keep your dotfiles in a git repository, and you can run it every so often to keep your local and remote dotfiles in sync.

Installation is as simple as:

git clone git@github.com:cneill/dot-man.git
OR
git clone https://github.com/cneill/dot-man.git

Let me know what you think! You can find me on Twitter @ccneill.

7 Small Reasons to Love Vim

These are some cool things you can do with Vim that save time and can help prevent mistakes from mouse selection. They’re mostly little things, but altogether they make up an editing environment that I simply love.

1. NERDTree (Docs) file deletion

<Ctrl-L> to open NERDTree, hjkl to move, mdy to delete

2. Easymotion (Docs). Check out their example GIFs, and you’ll never see movement with the keyboard the same again.

3. Executing shell commands without changing windows

:!ls ~
:!rm -rf ~/old.txt

4. Deleting everything inside quotation marks, function blocks, parameter lists, or tags

di" di' di` di{  di(  di[  di< (Delete text inside the enclosing pair)
dit   (Delete text inside the enclosing tag, e.g.: <div>TEXT</div>)

5. Selecting/deleting large blocks of text

Selecting: V <Ctrl-F> (page by page)
           V 500j (select the current line and the 500 below it)
Deleting: d500d (delete 500 lines)

6. Searching Dash (paid app, but worth it) using dash.vim (Docs)

:Dash each underscore
:Dash Vim

7. Deleting only blank lines on either side of the cursor

In ~/.vimrc:
" Ctrl-Down/Up deletes a blank line below/above, and Ctrl-j/k inserts one below/above.
nnoremap <silent><C-Down> m`:silent +g/\m^\s*$/d<CR>``:noh<CR>
nnoremap <silent><C-Up> m`:silent -g/\m^\s*$/d<CR>``:noh<CR>
nnoremap <silent><C-j> :set paste<CR>m`o<Esc>``:set nopaste<CR>
nnoremap <silent><C-k> :set paste<CR>m`O<Esc>``:set nopaste<CR>

If you have more awesome Vim tricks, shoot them to me in the comments!

50 Linux Resources For Developers

I try to always bookmark interesting things I find as I bumble around the internet. I’ve collected thousands of bookmarks over the years, and I want to share some of the cool stuff I’ve found. I call these Nuggets.

Today, I want to bring you a list of links that might help you on your path to understanding and appreciating Linux. I don’t consider myself some wizened Linux guru, but I have spent many, many hours looking for guides and tools to make my life easier while using it.

If you’ve ever struggled to find information about Linux basics, or you just want to polish up your skills, there’s probably something here for you. This guide will be particularly focused on developers, but there will be information here that’s applicable to many other Linux users. Some of it is specific to Ubuntu users, but much of it is applicable across the board.

I’ve by no means covered everything, so comment or tweet at me if you have any links you think I should include.
