Installing Python

Why Python and scikit-learn

Here’s a quick read about the strengths of Python and scikit-learn: Six reasons why I recommend scikit-learn by Ben Lorica (est. 0 hr.)

You can find many heavy-duty examples of what predictive analytics can be used for by perusing the Kaggle Competitions website. Make sure you select “All Competitions” and both the “Active” and “Completed” check-boxes. Although we won’t get to some of the more advanced topics, Python/scikit-learn has been used by many of the competition winners! And the example code at Kaggle is often written in Python/scikit-learn.

If you find any of these competition topics (open or closed) especially interesting, drop us an e-mail, and maybe we can take a look at the topic area a little bit sometime in the last two weeks of the course!

Installing Python, scikit-learn, and IPython

Follow the instructions in the Installation screenshow for your system. Feel free to do the install on more than one computer. Bringing a laptop to class (if you have one) is encouraged, but if you are like me, you may prefer to work on a desktop when at home.

Installing Continuum’s Anaconda Python on Windows

Here’s the Continuum Analytics website for the Anaconda download. Make sure you read the instructions on their website and watch my installation screenshow.

And here’s the screenshow for the Windows installation:

Watch me now: the Install Windows screenshow (html)

Download for later (right click on Windows): the Install Windows screenshow (mp4)

Installing Continuum’s Anaconda Python on a Mac

First, watch the Install Windows screenshow from above. It’s only about 15 minutes and has a lot of good background material that I won’t be putting into the Mac installation screenshow.

And here’s the screenshow for the Macintosh installation:

Watch me now: the Install Macintosh screenshow (html)

Download for later (right click on Windows): the Install Macintosh screenshow (mp4)
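
Once the install finishes on either system, a quick check along these lines can confirm that the pieces are in place. This is only a minimal sketch: the helper name installed_versions and the package list are illustrative choices, not part of the course materials, and it assumes scikit-learn, pandas, and IPython are imported under the names sklearn, pandas, and IPython. Run it with the Python that Anaconda installed.

```python
import importlib
import sys

# Import names assumed for this course's tools
# (scikit-learn is imported as "sklearn").
PACKAGES = ["sklearn", "pandas", "IPython"]


def installed_versions(names):
    """Return {name: version string, or None if the import fails}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing (or broken) in this Python environment.
            versions[name] = None
        else:
            # Most packages expose __version__; fall back if one does not.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in installed_versions(PACKAGES).items():
        print(name, version if version is not None else "NOT FOUND")
```

If any line prints NOT FOUND, the most likely cause is that a different Python than the Anaconda one is being run, so check which one your system picks up before reinstalling.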

ASSIGNMENT: Work Along with the Preview Video

…download the IPython notebook Preview file to work along with the Preview screenshow. When you click on the link, you will see a “Download as zip” in the upper right of the page it takes you to — that will get you the folder in its entirety.

MORE FUN (optional): pandas dataframes and scikit-learn

pandas Cookbook

Julia Evans is a lot of fun (and maybe a little zany). You can watch her PyCon 2014 presentation on YouTube showing off the power of pandas dataframes; it will be a lot more fun than the lecture videos 🙂

And, on the theory of “pay me now or pay me later,” you may want to work through the IPython notebooks in Julia’s GREAT pandas Cookbook tutorial. It’s not as much fun as watching her on video, but it is more recent and more thorough and a GREAT intro to pandas. You will find the IPython notebooks in their own folder in the My IPython Notebooks download I had you install, so you can just work along with her notes.

scikit-learn Tutorial

If you want to see the power of the analytic tools we will be using, you can spend some time watching Jake Vanderplas’ great scikit-learn tutorial (3 hours plus) from PyCon 2014. Again, you will find the IPython notebooks to go along with this in your My IPython Notebooks. The tutorial is NOT intended to explain WHY you use the techniques; rather, it shows you HOW.