ESAC DATA ANALYSIS AND STATISTICS WORKSHOP 2014
(Original ANNOUNCEMENT)

 


DATES & LOCATION

The workshop will take place on 27-31 October 2014, in rooms D1/D2 at ESAC.

 

TUTORS

 

NOTEBOOKS AND NOTES

 

AGENDA

    • Monday 27 October: Model Fitting (theory)
      • 9:00 - 9:30: Welcome and installation troubleshooting
      • 9:30 - 11:00: Model fitting lecture (Jake VanderPlas) Notebook
      • 11:00 - 11:30: coffee/tea break
      • 11:30 - 13:00: Model fitting hands-on session (Jake VanderPlas)
      • 13:00 - 14:00: Lunch @ ESAC canteen
      • 14:00 - 15:30: Model fitting lecture (Jake VanderPlas)
      • 15:30 - 16:00: coffee/tea break
      • 16:00 - 17:30: Model fitting hands-on session (Jake VanderPlas)

     

    • Tuesday 28 October: Model Fitting (hands-on) (1/2); Model Selection (theory) (1/2)
      • 9:30 - 11:00: Model fitting lecture (Luis Sarro) Notebook
      • 11:00 - 11:30: coffee/tea break
      • 11:30 - 13:00: Model fitting hands-on session (Luis Sarro)
      • 13:00 - 14:00: Lunch @ ESAC canteen
      • 14:00 - 15:30: Model selection lecture (Luis Sarro)
      • 15:30 - 16:00: coffee/tea break and group picture
      • 16:00 - 17:30: Model selection hands-on session (Luis Sarro)

     

    • Wednesday 29 October: Model Selection (theory and hands-on)
      • 9:30 - 11:00: Model selection lecture (Luis Sarro) Notebook
      • 11:00 - 11:30: coffee/tea break
      • 11:30 - 13:00: Poisson Statistics in High-Energy Astrophysics (Andy Pollock) Slides
      • 13:00 - 14:00: Lunch @ ESAC canteen
      • 14:00 - 15:30: Hackathon: participants work on their own data, with the tutors on hand to help
      • 15:30 - 16:00: coffee/tea break
      • 16:00 - 17:30: Presentations of results from the Hackathon

     

    • Thursday 30 October: Time Domain Astronomy and Data Mining (theory and hands-on)

     

    • Friday 31 October: Data Mining (theory and hands-on)
      • 9:30 - 11:00: Data mining lecture (Jake VanderPlas) Notebook
      • 11:00 - 11:30: coffee/tea break
      • 11:30 - 13:00: Data mining hands-on session (Jake VanderPlas)
      • 13:00 - 14:00: Lunch @ ESAC canteen
      • 14:00 - 15:30: Data mining lecture (Jake VanderPlas)
      • 15:30 - 16:00: coffee/tea break
      • 16:00 - 17:30: Data mining hands-on session (Jake VanderPlas)

 

READING MATERIAL

In order to benefit most from the workshop, we recommend that you read the following papers in advance:

 

INSTALLATION INSTRUCTIONS

Participants should install the following software before the workshop:

  • python, with the following packages: numpy, matplotlib, scipy, scikit-learn, emcee
  • ipython notebook
  • heasoft (link) - only for poissonian statistics lecture
  • R (link) - optional

MAC OS X

On Mac OS X, X11/XQuartz will have to be installed, in addition to the software mentioned above.

All instructions below assume that the bash shell is used, as it is the default shell on Mac OS X.
(Adapt instructions accordingly if you changed your default shell.)

PYTHON & IPYTHON NOTEBOOK

We recommend the all-in-one scientific Python installer Anaconda.

  1. Download Anaconda from http://continuum.io/downloads
    For Mac OS X 10.7 (Lion), 10.8 (Mountain Lion), or 10.9 (Mavericks), pick "Mac OS X — 64-Bit Python 2.7 Graphical Installer"
    If you have Mac OS X 10.6 (Snow Leopard), you may use an older version of Anaconda.
  2. Double-click to install, and be sure to leave the default "Modify PATH" option selected.

Most of the necessary python modules already come by default with Anaconda: numpy, matplotlib, scipy, scikit-learn.

The only python module that needs to be added is emcee:

  1. Install emcee in anaconda:
    conda install -c williamsmj emcee

Test the installation:

  1. Launch python:
      python
    This should start python, and the version should mention Anaconda.

    Exit with Control-D.
  2. Launch ipython:
      ipython
    This should start ipython, and the version should mention Anaconda.
    Exit with Control-D.
  3. Launch ipython notebook:
      ipython notebook
    This should open your default browser, and present you with the notebook dashboard.
    Exit by closing the page (in the browser) and pressing Control-C (in the terminal).
    Note: When the OS language is not English, ipython notebook may crash with the error "ValueError: unknown locale: UTF-8".
    In that case, before launching ipython notebook, type:
      export LC_CTYPE=en_GB.UTF-8
  4. Launch python and test the different modules:
      import numpy
      print numpy.__version__
      import matplotlib
      print matplotlib.__version__
      import scipy
      print scipy.__version__
      import sklearn
      print sklearn.__version__
      import emcee
      print emcee.__version__
    All the python modules should load properly, and they should all print their version.
    Exit with Control-D.
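
The module test in step 4 can also be run as a single script. The following is a minimal sketch, not part of the official instructions: the helper name check_modules is made up for this example, and the script is written so that it should run unchanged under Python 2.7 and Python 3.

```python
from __future__ import print_function  # harmless on Python 3, needed on Python 2.7

import importlib

# Module names taken from the installation instructions above;
# note that scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["numpy", "matplotlib", "scipy", "sklearn", "emcee"]


def check_modules(names):
    """Map each module name to its version string, or to None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" if the module defines no __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(check_modules(REQUIRED_MODULES).items()):
        print("%-12s %s" % (name, version if version else "NOT INSTALLED"))
```

Save it under any file name (for example check_install.py, a hypothetical name) and run it with python. Any module reported as NOT INSTALLED needs to be installed before the workshop.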

R

Optional

  1. Download R from http://cran.es.r-project.org/bin/macosx/
    There are 2 packages, so be sure to pick the one that matches your version of Mac OS X:
    R-3.1.1-snowleopard.pkg for Mac OS X 10.6 (Snow Leopard), 10.7 (Lion), 10.8 (Mountain Lion)
    R-3.1.1-mavericks.pkg for Mac OS X 10.9 (Mavericks)
  2. Double-click to install

HEASOFT

Only needed for the Poissonian Statistics Lecture.

We recommend downloading pre-compiled binaries. Link to detailed instructions.

  1. Go to: http://heasarc.gsfc.nasa.gov/docs/software/lheasoft/download.html
  2. In Step 1, select the pre-compiled binary distribution for your operating system: Mac OS X 10.7 (Lion), 10.8 (Mountain Lion), or 10.9 (Mavericks).
    The current version of heasoft is not available for Mac OS X 10.6 (Snow Leopard).
  3. In Step 2, check "All"
  4. Click "submit", and wait for the download to finish
    The file will be named heasoft-6.16mac_intel_darwin13.tar.gz (or a very similar name, adapt instructions accordingly)
  5. Copy the downloaded file into the directory where you want to install the software
    We suggest using: /usr/local/heasoft/
  6. Uncompress the file:
      tar xvfz heasoft-6.16mac_intel_darwin13.tar.gz
  7. Go into the heasoft-6.16/PLATFORM/BUILD_DIR directory, where PLATFORM will be i386-apple-darwin12.5.0 (or a very similar name, adapt instructions accordingly):
      cd heasoft-6.16/i386-apple-darwin12.5.0/BUILD_DIR/
  8. Configure the software:
      ./configure

Every time you want to use heasoft in a new terminal, you need to initialize it:
  export HEADAS=/usr/local/heasoft/heasoft-6.16/i386-apple-darwin12.5.0
  . $HEADAS/headas-init.sh

Test the installation:

  1. In a terminal, type:
      fversion
    This should return the version of heasoft
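
If fversion is not found, a common cause is that the initialization was skipped in the current terminal or that HEADAS points to the wrong directory. The sketch below checks only what the instructions above establish (the HEADAS variable and the presence of headas-init.sh in that directory); the function name check_headas is an assumption made for this example, and the script does not verify that headas-init.sh has actually been sourced.

```python
import os


def check_headas(environ=None):
    """Return a list of problems found with the HEADAS setup (empty if none)."""
    if environ is None:
        environ = os.environ
    problems = []
    headas = environ.get("HEADAS")
    if not headas:
        problems.append("HEADAS is not set")
        return problems
    if not os.path.isdir(headas):
        problems.append("HEADAS directory does not exist: %s" % headas)
    elif not os.path.isfile(os.path.join(headas, "headas-init.sh")):
        problems.append("headas-init.sh not found in %s" % headas)
    return problems


if __name__ == "__main__":
    issues = check_headas()
    print("HEADAS looks fine" if not issues else "\n".join(issues))
```

Run it from the same terminal in which heasoft was initialized, since the script only sees variables exported in that session.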

X11/XQUARTZ

Some software included in heasoft, as well as some python modules, require X11 (also known as XQuartz).

  1. Download XQuartz from: http://xquartz.macosforge.org/
    The same version works for Mac OS X 10.6 (Snow Leopard), 10.7 (Lion), 10.8 (Mountain Lion), and 10.9 (Mavericks).
  2. Double-click to install

Test the installation:

  1. In a terminal, type:
      xeyes
    This should open a small window with eyes that follow your mouse.
    Exit by closing the window, and quitting X11.

LINUX

All instructions below assume that the bash shell is used; adapt instructions accordingly if you use a different shell.

PYTHON & IPYTHON NOTEBOOK

We recommend the all-in-one scientific Python installer Anaconda.

  1. Download Anaconda from http://continuum.io/downloads
    The file will be named Anaconda-2.1.0-Linux-x86.sh (or a very similar name, adapt instructions accordingly)
  2. Install Anaconda with:
      bash Anaconda-2.1.0-Linux-x86.sh
    Note that you should type bash, regardless of whether or not you are actually using the bash shell.
    Follow the text-only prompts.
    When there is a colon at the bottom of the screen, press the down arrow to move down through the text.
  3. Type yes and press enter to approve the license.
  4. Press enter to approve the default location for the files.
  5. Type yes and press enter to prepend Anaconda to your PATH (this makes the Anaconda distribution the default Python).

Most of the necessary python modules already come by default with Anaconda: numpy, matplotlib, scipy, scikit-learn.

The only python module that needs to be added is emcee:

  1. Install emcee in anaconda (on a 64-bit linux):
    conda install -c lrp emcee
    Note: if you are on a 32-bit linux, use the following command instead:
    conda install -c auto emcee
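
If you are unsure which of the two commands applies, the following sketch reports both the machine type seen by the operating system and the word size of the Python interpreter that runs it. It uses only the standard library; the function name python_bits is made up for this example. The two values can differ, because a 32-bit Python can be installed on a 64-bit system.

```python
import platform
import struct


def python_bits():
    """Return the pointer size of the running Python interpreter in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Machine type reported by the OS: %s" % platform.machine())
    print("Python interpreter: %d-bit" % python_bits())
```

Run it with the Anaconda python, so that the reported word size is that of the interpreter emcee will be installed into.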

Test the installation:

  1. Launch python:
      python
    This should start python, and the version should mention Anaconda.

    Exit with Control-D.
  2. Launch ipython:
      ipython
    This should start ipython, and the version should mention Anaconda.
    Exit with Control-D.
  3. Launch ipython notebook:
      ipython notebook
    This should open your default browser, and present you with the notebook dashboard.
    Exit by closing the page (in the browser) and pressing Control-C (in the terminal).
  4. Launch python and test the different modules:
      import numpy
      print numpy.__version__
      import matplotlib
      print matplotlib.__version__
      import scipy
      print scipy.__version__
      import sklearn
      print sklearn.__version__
      import emcee
      print emcee.__version__
    All the python modules should load properly, and they should all print their version.
    Exit with Control-D.
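
If the version banner does not mention Anaconda, another Python installation is probably found first in your PATH. The sketch below prints which interpreter is actually running. It relies on the statement above that the Anaconda banner contains the word "Anaconda", which is an assumption that may not hold for every Anaconda release; the function name mentions_anaconda is made up for this example.

```python
import sys


def mentions_anaconda(version_string):
    """Return True if a Python version banner mentions Anaconda."""
    return "anaconda" in version_string.lower()


if __name__ == "__main__":
    print("Interpreter: %s" % sys.executable)
    print("Version:     %s" % sys.version.replace("\n", " "))
    if mentions_anaconda(sys.version):
        print("This looks like the Anaconda Python.")
    else:
        print("Anaconda is not mentioned; check that it comes first in your PATH.")
```

The interpreter path printed on the first line should point into the directory chosen during the Anaconda installation.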

R

Optional

  1. Download R from http://cran.es.r-project.org/bin/linux/ or alternatively use your Linux package management system
  2. Install the package

HEASOFT

Only needed for the Poissonian Statistics Lecture.

We suggest downloading pre-compiled binaries. Link to detailed instructions. Notes on the portability of pre-compiled Linux binaries.

  1. Go to: http://heasarc.gsfc.nasa.gov/docs/software/lheasoft/download.html
  2. In Step 1, select the pre-compiled binary distribution for your operating system: Linux 32-bit, or Linux 64-bit.
  3. In Step 2, check "All"
  4. Click "submit", and wait for the download to finish
    The file will be named heasoft-6.16pc_linux64.tar.gz (or a very similar name, adapt instructions accordingly)
  5. Copy the downloaded file into the directory where you want to install the software
    We suggest using: /usr/local/heasoft/
  6. Uncompress the file:
      tar xvfz heasoft-6.16pc_linux64.tar.gz
  7. Go into the heasoft-6.16/PLATFORM/BUILD_DIR directory, where PLATFORM will be i686-pc-linux-gnu-libc2.5 (or a very similar name, adapt instructions accordingly):
      cd heasoft-6.16/i686-pc-linux-gnu-libc2.5/BUILD_DIR/
  8. Configure the software:
      ./configure

Every time you want to use heasoft in a new terminal, you need to initialize it:
  export HEADAS=/usr/local/heasoft/heasoft-6.16/i686-pc-linux-gnu-libc2.5
  . $HEADAS/headas-init.sh

Test the installation:

  1. In a terminal, type:
      fversion
    This should return the version of heasoft

WINDOWS

The main issue in Windows is the lack of a packaged version of emcee.

PYTHON & IPYTHON NOTEBOOK

We recommend the all-in-one scientific Python installer Anaconda.

  1. Download Anaconda from http://continuum.io/downloads
    The file will be named Anaconda-2.1.0-Windows-x86_64.exe (or a very similar name, adapt instructions accordingly)
  2. This package contains Python 2.7.
  3. Install Anaconda following the wizard and accepting all the defaults.

Most of the necessary python modules already come by default with Anaconda: numpy, matplotlib, scipy, scikit-learn.

The only python module that needs to be added is emcee. It is not available as a package for Windows, so it has to be downloaded from GitHub and installed manually:

  1. Download a ZIP package with the emcee code from https://github.com/dfm/emcee/zipball/master
  2. Unpack the archive in a temporary directory
  3. Change to the temporary directory created in step 2 and run:
      python setup.py install
    This will add emcee to the package library managed by Anaconda.

Test the installation:

  1. Launch python:
      python
    This should start python, and the version should mention Anaconda.

    Exit with Control-Z followed by Enter (or type exit()).
  2. Launch ipython:
      ipython
    This should start ipython, and the version should mention Anaconda.
    Exit with Control-D.
  3. Launch ipython notebook:
      ipython notebook
    This should open your default browser, and present you with the notebook dashboard.
    Exit by closing the page (in the browser) and pressing Control-C (in the terminal).
  4. Launch python and test the different modules:
      import numpy
      print numpy.__version__
      import matplotlib
      print matplotlib.__version__
      import scipy
      print scipy.__version__
      import sklearn
      print sklearn.__version__
      import emcee
      print emcee.__version__
    All the python modules should load properly, and they should all print their version.
    Exit with Control-Z followed by Enter (or type exit()).

R

Optional

  1. Download R from http://cran.r-project.org/bin/windows/base/
  2. Install the package following the wizard and accepting all defaults.

HEASOFT

Only needed for the Poissonian Statistics Lecture.

The Heasoft suite only works on Windows through Cygwin (32-bit), as described here. Cygwin therefore has to be set up first.

CYGWIN

  1. Download Cygwin from http://cygwin.com
  2. Choose the 32-bit version as required for Heasoft.
  3. Install Cygwin following the wizard and the instructions found here.
  4. Make sure that the mirror site chosen during the installation of Cygwin is near your location by checking the domain name of the site (.ac.uk, .de, .es, .fr, etc.). Distant mirror sites can be very slow.

HEASOFT

  1. Download the pre-compiled Windows (Cygwin) binaries from http://heasarc.gsfc.nasa.gov/lheasoft/download.html
  2. Select all heasoft packages except for Suzaku and FV/GUIs, which are not available for Windows, and click on Submit as instructed.
  3. Uncompress the file with any archiving utility like 7-Zip or PeaZip.
  4. We suggest unpacking into an appropriate top-level directory such as C:\heasoft. (NOTE: Within Cygwin's shell, Windows drive letters are accessible under /cygdrive, e.g. /cygdrive/c for the C: drive, so the heasoft location would be /cygdrive/c/heasoft. Other Windows partitions, if available, appear in the same way.)
  5. Start the Cygwin shell through the shortcut and change directory to e.g. /cygdrive/c/heasoft.
  6. Go into the heasoft-6.16/PLATFORM/BUILD_DIR directory, where PLATFORM will be i686-pc-cygwin (or a very similar name, adapt instructions accordingly):
      cd heasoft-6.16/i686-pc-cygwin/BUILD_DIR/
  7. Configure the software:
      ./configure
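
The mapping between a Windows path and its location inside the Cygwin shell (C:\heasoft becoming /cygdrive/c/heasoft) can be sketched as a small function. This is an illustration only: the name to_cygdrive is made up for this example, it assumes Cygwin's default /cygdrive prefix, and it handles only absolute paths that start with a drive letter. Cygwin also ships its own cygpath utility for the same purpose.

```python
import re


def to_cygdrive(windows_path):
    r"""Convert an absolute Windows path such as C:\heasoft to /cygdrive/c/heasoft."""
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", windows_path)
    if match is None:
        raise ValueError("not an absolute Windows path: %r" % windows_path)
    drive, rest = match.groups()
    # Use forward slashes and drop any leading or trailing separator.
    rest = rest.replace("\\", "/").strip("/")
    path = "/cygdrive/" + drive.lower()
    return path + "/" + rest if rest else path


if __name__ == "__main__":
    print(to_cygdrive(r"C:\heasoft"))
```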

Every time you want to use heasoft in a new terminal (Cygwin session), you need to initialize it (the path below assumes the suggested C:\heasoft location; adapt it if you installed elsewhere):
  export HEADAS=/cygdrive/c/heasoft/heasoft-6.16/i686-pc-cygwin
  . $HEADAS/headas-init.sh

Test the installation:

  1. In a terminal, type:
      fversion
    This should return the version of heasoft

 

ORGANIZING COMMITTEE

  • Guillaume Belanger
  • Hervé Bouy
  • Carlos Gabriel
  • Matteo Guainazzi
  • Jean-Christophe Leyder
  • Bruno Merín (chair)
  • William O'Mullane
  • Andy Pollock
  • Elena Puga
  • Álvaro Ribas
  • Celia Sánchez
  • Roland Vavrek

 

FUNDING

The SOC warmly thanks the ESAC Science Faculty for fully funding this workshop.