scikit-learn

There’s a lot more to Python’s `scikit` than I realized:

~~~~~~~~~~~~~~~~~~~
[~]% port search scikit

py27-scikit-image @0.10.1 (python, science)
Image processing algorithms for SciPy.

py27-scikit-learn @0.15.2 (python, science)
Easy-to-use and general-purpose machine learning in Python

py27-scikits-bootstrap @0.3.2 (python, science)
Bootstrap confidence interval estimation routines for SciPy.

py27-scikits-bvp_solver @1.1 (python, science)
bvp_solver is a Python package for solving two-point boundary value
problems.

py27-scikits-module @0.1 (python)
provides the files common to all scikits

py27-scikits-samplerate @0.3.3 (python, science, audio)
A Python module for high quality audio resampling
~~~~~~~~~~~~~~~~~~~

And these are just the Python 2.7 offerings. There are similar offerings for 2.6, 3.2, and 3.3.

Speaking of 3.3, it looks like most of the libraries with which I work now have 3.3-compatible versions. Time to upgrade myself?
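A quick way to answer that question on a given machine is to ask the interpreter which libraries it can actually find. What follows is only a minimal sketch: the list of module names is a hypothetical stand-in for whatever you really use, and `importlib.util.find_spec` needs Python 3.4 or later, so it assumes an interpreter a little newer than the 3.3 mentioned above.

```python
import importlib.util
import sys


def importable(names):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Hypothetical list: substitute the libraries you actually work with.
    wanted = ["sklearn", "skimage", "scrapy"]
    print("Python %d.%d" % sys.version_info[:2])
    for name, found in sorted(importable(wanted).items()):
        print("%-10s %s" % (name, "available" if found else "missing"))
```

Run it once under each interpreter you are considering; anything reported as missing still needs a port or a `pip install` for that version.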

I also installed `scrapy` this morning. I’m not quite ready to scrape the web for my own work, but the library looked like it had some useful functionality that I could at least begin to get familiar with.

**EDITED** to defeat autocorrect, which had changed `scikit` to *sickout* and `scrapy` to *scrap* without my noticing.

**Also**: many thanks again to Michel Fortin and his amazing [Markdown PHP plug-in][]. The code fence blocks are a real time-saver: no need to indent everything with four spaces after a colon. Just block off code with the same number of tildes on either end. *Done*.

[Markdown PHP plug-in]: https://michelf.ca/projects/php-markdown/

Towards a Better eReader

Chances are good that a good number of people received an e-reader of some kind this Christmas. From the price drops at Amazon to the amazing number of really great looking models I saw at our local Barnes and Noble, the opportunity to have hundreds of books in your pocket or bag has never been better. My first Kindle was the fourth generation model that broke the $100 barrier, when Amazon offered it for $80 a few years ago (with ads).

Some time in early autumn of this year, I decided both that I liked reading on the Kindle and that the Kindle on which I was reading was not offering the best reading experience. I decided to make the jump from the plain jane Kindle 4 that I had to the Kindle Paperwhite. While reading on the Kindle 4 was pleasant enough, its lack of a built-in light and its always *just a bit too gray* screen meant that I didn’t like to read on it for long stretches of time. However, its light weight and slim profile made it much more comfortable to hold than my iPad Air.

*And, yes, I realize that the iPad Mini is out there. My daughter has the newer version, and it’s quite compelling, but I wasn’t quite ready to make the $350 leap to a new reading experience that the iPad Mini entails. And, besides, this note is going in a slightly different direction, as we’ll see in a moment.*

So I clicked the button and got the Kindle Paperwhite. The higher resolution and whiter screen made a noticeable difference in my reading experience. This was much more like a book. It’s not perfect: the e-ink technology isn’t quite *there* yet, but it is awfully close. So much so that I began to wonder about page sizes, and thus about the size of the Kindle itself.

And here I should note that while I do enjoy reading a fair amount of general fiction and non-fiction on my Kindle, I also enjoy reading any number of technical books that I have purchased from [O’Reilly][], [Packt Publishing][], and [Pragmatic Programmers][], all of whom are generous enough to offer books in any of the three formats of PDF, EPUB, or MOBI that I prefer.

And I often prefer multiple versions: I switch between reading a text as a PDF on my iPad and as a MOBI document on my Kindle, but, to be honest, the shifting nature of the EPUB and the small screen of the Kindle sometimes combine to make me long for a larger Kindle, a Kindle Pro, as it were, that would be able to display trade-sized pages, most of which could be acceptably displayed in the blacks and whites of the current e-ink technology.

Yes, I recognize that I am, in fact, longing for a revived, higher-resolution Kindle DX, but surely I cannot be the only academic who likes to read on something as light and easy to use as the Kindle, and academics are not the only professionals who read periodicals, including law, engineering, and medical journals, almost all of which are published in sizes like 6 x 9, 7 x 9, or 7 x 10 inches. If their contents are made available electronically, it is almost always as a PDF, not as an EPUB.

That means a Kindle more the size of the current iPad Air than anything else. To make the comparison clearer, I photographed my Kindle atop the notebook which accompanies me wherever I go, an [A5-sized Leuchtturm notebook using the Whitelines technology][lw].

The Paperwhite’s display looks good, but when you compare it to an iPad as well as to a book page, you get a much clearer sense of what you’re missing:

I don’t think physical context should be easily dismissed, but, perhaps just as importantly, the small size of the Kindle “page” means that the kind of larger illustrations or code spans that dominate technical and professional publications are too often truncated or made difficult to follow or parse. A closer look at an actual page versus its representation as a PDF on an iPad is a powerful contrast:

The page is the page, if only a bit smaller — and, in GoodReader, like most apps of this kind I assume, the difference is easily made up through a quick zoom.

The smallness of the Kindle is also revealed when you stack it with a couple of professional periodicals and a technical book. Its mass-market paperback size emerges pretty clearly:

There’s nothing wrong with that, but I am, honestly, surprised that Amazon, given its interest in playing with market segmentation, as glimpsed with the presence of the Kindle, the Kindle Paperwhite, and now the high-end Kindle Voyage, as well as the many sizes of the Kindle Fire and the Kindle Fire TV and the Fire TV Stick, has not returned to offer a large-format e-reader.

Would I buy one when I already have an iPad Air? My answer is a considered *yes*. What I like about the Kindle is that with a cost of approximately $100, I do not worry about the device the way I do a $500+ iPad. I throw it in a bag; I slip it in a coat pocket. I carry it by hand, usually face-down against my notebook, when I run errands. Equipped with these two things and a pen or pencil, I am free to read as I like and to write as I like. Combine those things with my smart phone and I have something like a portable office. Include my laptop, and I can work for days.

It is not even clear to me, now, that if I had such a device I would replace my iPad Air when its time comes, as it surely will for all such devices dependent on more complex operating systems. As my time becomes more precious, and I want to be more productive, I find that I spend less time watching video and more time reading, using my drive times to listen to podcasts or audiobooks. While many find that an iPad is all the device they need, I find that my MacBook Pro is the multi-purpose computing device I prefer, and what I want as an accompaniment is something which makes my reading easier. And I find that the expense and the screen of the iPad are simply not as comfortable to me as the Kindle, and that is why I want a Kindle Pro, or something like it. Who’s with me?

[O’Reilly]: http://oreilly.com/
[Packt Publishing]: https://www.packtpub.com
[Pragmatic Programmers]: https://pragprog.com
[lw]: http://www.amazon.com/gp/product/B00IYL8ZV8/ref=as_li_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00IYL8ZV8&linkCode=as2&tag=johnlaudun-20&linkId=XJPE76GLQZLSKKRI

Real Science Sunday

What better way to spend a Sunday than watching some great videos that explain or treat science? [Real Clear Science][] has you covered. I am particularly fond of the video of [Schrodinger’s Cat][]. I am less fond of Neil deGrasse Tyson’s response to the question of genetically modified foods. I think Tyson’s response shows that expertise in one realm, astrophysics, does not transfer to other realms, such as biology. It’s my understanding, at least, that the kind of tinkering with the genome achieved through plant hybridization is different from that achieved through direct tweaking of genes. I’ve checked with biologists, and they say I’m right and Neil is wrong. *Sigh.* Neil, Neil, Neil.

[Real Clear Science]: http://www.realclearscience.com/
[Schrodinger’s Cat]: http://www.realclearscience.com/video/2014/10/15/schrdingers_cat_a_wonderful_explainer.html

New Power

*Gah!* It has been 15 years since [_The Clue Train Manifesto_][] went live (April 1999 for those wanting precision), and it would appear that a good chunk of the corporate world still needs it explained to them. Witness the _Harvard Business Review_’s [“Understanding “New Power””][], which somehow manages to fuse, really confuse, in their own words “increasing political protest, a crisis in representation and governance, and upstart businesses upending traditional industries.” Okay, so one should be generous with HBR: it doesn’t really serve them to think about things like income inequality. (More seriously, HBR can often be a lot smarter than its home in the Harvard Business School, which gave the world the MBA, and thus probably deserves a special place in the annals of *Ideas That Destroyed Civilization As We Know It (And Just When Things Were Looking Up)*.)

HBR’s definitions of old and new power are reasonable, however:

> Power, as British philosopher Bertrand Russell defined it, is simply “the ability to produce intended effects.” Old power and new power produce these effects differently. New power models are enabled by peer coordination and the agency of the crowd—without participation, they are just empty vessels. Old power is enabled by what people or organizations own, know, or control that nobody else does—once old power models lose that, they lose their advantage.

And we can only hope that someone, somewhere is making notes on where there is excitement and innovation in the world and where there is not.

[_The Clue Train Manifesto_]: http://www.cluetrain.com
[“Understanding “New Power””]: https://hbr.org/2014/12/understanding-new-power

FBI Careers

The FBI is looking for [special (cyber) agents, computer scientists, and information technology forensic examiners][fbi]. *Hmmm.* I find myself intrigued. Or, put another way, I’m developing some relevant skills, and I wouldn’t mind the improvement in pay, insurance, and retirement. Maybe my next uncle will be named Sam.

[fbi]: http://www.fbijobs.gov/cybercareers/

SL-1

If you like films from that era, be sure to check out the (rather dry) film [SL-1][] by the Atomic Energy Commission describing the only fatal nuclear meltdown in the U.S. [Wikipedia article for the curious.][w]

[SL-1]: https://www.youtube.com/watch?v=Q0zT9ARfsT4
[w]: http://en.wikipedia.org/wiki/SL-1

FLAK

[Brad Paley][] and [Edward Tufte][] are both fond of turning to past work in order to understand how to do better work in the present. Tufte is probably better known, but I actually had the chance to meet Paley and to talk with him, and he is a generous, insightful individual. Like them, I like to explore design work from the past to see what it can teach us in the present about communicating effectively and eloquently.

Case in point is this Army Air Force film entitled simply _FLAK_ (1944, T.F. I-3389), which you can watch for yourself on [Youtube][], and it’s probably available at the Internet Archive, if you look for it. The term is itself from the German abbreviation of *Flugzeugabwehrkanone* [air defense cannon], and reveals, perhaps, that it was the preferred term over *Ack-Ack*, which was the British term for *anti-aircraft* defenses. The film itself, like others from this era, is a master-class in what you can do with limited resources, both in terms of time and money, and when your need to communicate is urgent: in this case the film is addressed to air crews who are, perhaps, not keen to follow prescribed course and altitude changes or who want to make changes but are told not to. I can only imagine how terrifying flak was to experience while bouncing around inside one of those bombers, and any film would have to overcome that fear.

There is probably a much longer post for the visualizations included in the film, and in others like it, and none of them will be well served by stills, but here is at least one example that, I hope, captures the spirit of the visuals used:


The still comes from a moment in the film where the narration has explained how long it takes for air defenses to acquire a target, establishing its course and altitude; predict its course; communicate that prediction as firing directions to associated batteries; and for the batteries to fire and the shells to reach the targeted area and explode. It was something like 35 seconds in 1944. The recommendation was to make course changes about one second for every 1000 feet of altitude, and the segment that follows explains why simply flying short zig-zags is actually dangerous: they can so easily be normalized.
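To put rough numbers on that rule of thumb, here is a small sketch of the arithmetic. It assumes my own reading of the recommendation, namely that a crew holds a heading for about one second per 1000 feet of altitude before changing course; the function name and the sample altitudes are mine, and only the 35-second figure comes from the film as I remember it.

```python
# Approximate time for the defenses to track, predict, and fire (from the film).
PREDICTION_CYCLE_S = 35


def seconds_between_turns(altitude_ft, seconds_per_1000_ft=1.0):
    """Interval between course changes under the assumed one-second-per-1000-feet rule."""
    return altitude_ft / 1000.0 * seconds_per_1000_ft


# Sample altitudes are illustrative, not taken from the film.
for altitude in (10000, 20000, 25000, 30000):
    interval = seconds_between_turns(altitude)
    margin = PREDICTION_CYCLE_S - interval
    print("%6d ft: turn every %4.1f s (%4.1f s inside the prediction cycle)"
          % (altitude, interval, margin))
```

On this reading, every interval for these altitudes is shorter than the 35 seconds the defenses needed, which would be the point of the rule: the aircraft has already turned by the time the shells arrive where it was predicted to be.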

The lines and graphics unfold across maps and landscapes in ways that make complete sense and allow viewers to “see” for themselves the way things work.

[Brad Paley]: http://wbpaley.com/brad/
[Edward Tufte]: http://www.edwardtufte.com/tufte/
[Youtube]: https://www.youtube.com/watch?v=PIYVwqHM488

historydata

What a terrific idea: Lincoln Mullen has uploaded [sample data sets for historians learning R][cran]. His note states that “they include population, institutional, religious, military, and prosopographical data suitable for mapping, quantitative analysis, and network analysis.” I would love to see something similar done for folklore studies, and I’ll see what I can do to make that happen.

In the meantime, many thanks to Lincoln for doing this. One of the many crossroads at which individuals find themselves when they begin the journey towards computation is not having any material with which to work. Quite often, writers describing their work assume that everyone already has a corpus of material with which to work. Or, we act as though anyone is going to pull stuff off [Project Gutenberg][pg]. A controlled data set gives new users a chance to try things out and get predictable results.

[cran]: http://cran.r-project.org/web/packages/historydata/index.html
[pg]: http://www.gutenberg.org

Christmas Quotes

This year’s presents are getting quotations that are, somehow, relevant to the contents.

For one present (a tool long borrowed from my stepfather):

> For the apparel oft proclaims the man,
> And they in France of the best rank and station
> Are most select and generous, chief in that.
> Neither a borrower nor a lender be;
> For loan oft loses both itself and friend,
> And borrowing dulls the edge of husbandry.

For my wife, who has embarked upon a course in her research that threatens to converge on my own:

> Egon Spengler: There’s something very important I forgot to tell you.
> Peter Venkman: What?
> Spengler: Don’t cross the streams.
> Venkman: Why?
> Spengler: It would be bad.
> Venkman: I’m fuzzy on the whole good/bad thing. What do you mean, “bad”?
> Spengler: Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light.
> Ray Stantz: Total protonic reversal!

Inside is a copy of Shelley Jackson’s _Patchwork Girl_ published by Eastgate on a USB flash drive.