super_gui

This module provides the superplot GUI.

class superplot.super_gui.GUIControl(data_file, info_file, default_plot_type=0)[source]

Main GUI element for superplot. Presents controls for selecting plot options, creating a plot, and saving a plot.

Parameters:
  • data_file (string) – Path to chain file
  • info_file – Path to info file
  • xindex (integer) – Default x-data index
  • yindex (integer) – Default y-data index
  • zindex (integer) – Default z-data index
  • default_plot_type (integer) – Default plot type index
superplot.super_gui.main()[source]

SuperPlot program - open relevant files and make GUI.

superplot.super_gui.message_dialog(message_type, message)[source]

Show a message dialog.

Parameters:
  • message_type (gtk.MessageType) – Type of dialogue, e.g. gtk.MESSAGE_WARNING or gtk.MESSAGE_ERROR
  • message (string) – Text to show in dialogue
superplot.super_gui.open_file_gui(window_title='Open', set_name=None, add_pattern=None, allow_no_file=True)[source]

GUI for opening a file with a file browser.

Parameters:
  • window_title (string) – Window title
  • set_name (string) – Title of filter
  • add_pattern (list) – Acceptable file patterns in filter, e.g. ["*.pdf"]
  • allow_no_file (bool) – Allow for no file to be selected
Returns:

Name of file selected with GUI.

Return type:

string

superplot.super_gui.save_file_gui(window_title='Save As', set_name=None, add_pattern=None)[source]

GUI for saving a file with a file browser.

Parameters:
  • window_title (string) – Window title
  • set_name (string) – Title of filter
  • add_pattern (list) – Acceptable file patterns in filter, e.g. ["*.pdf"]
Returns:

Name of file selected with GUI.

Return type:

string

statslib

Point Statistical Functions

This module contains statistical functions that return a single data point.

superplot.statslib.point.p_value(chi_sq, dof)[source]

Calculate the \(\textrm{$p$-value}\) from a chi-squared distribution:

\[\textrm{$p$-value} \equiv \int_{\chi^2}^\infty f(x; k) \, dx\]
Parameters:
  • chi_sq (numpy.ndarray) – Data column of chi-squared
  • dof (integer) – Number of degrees of freedom
Returns:

A p-value for the given chi_sq, dof

Return type:

numpy.float64

>>> round(p_value(data[1], 2), DOCTEST_PRECISION)
0.9999991597
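
As an illustrative sketch (not the superplot source), the integral above can be evaluated with the chi-squared survival function from SciPy. The helper name `p_value_sketch` is hypothetical, and using the smallest chi-squared in the column as the test statistic is an assumption that the description above does not state.

```python
import numpy as np
from scipy import stats


def p_value_sketch(chi_sq, dof):
    """Upper-tail probability of a chi-squared distribution (hypothetical helper)."""
    # Assumption: the smallest chi-squared in the column is the test statistic.
    min_chi_sq = np.min(chi_sq)
    # Survival function = 1 - CDF, i.e. the integral of f(x; dof)
    # from min_chi_sq to infinity.
    return stats.chi2.sf(min_chi_sq, dof)


# Toy chi-squared column; for dof=2 the survival function is exp(-x / 2).
p = p_value_sketch(np.array([5.2, 2.0, 3.7]), dof=2)
print(p)
```

For two degrees of freedom the result can be checked by hand, because the survival function reduces to exp(-x/2).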

One Dimensional Statistical Functions

This module contains all the functions for analyzing a chain (*.txt file) and calculating the 1D stats for a particular variable.

Two Dimensional Statistical Functions

This module contains all the functions for analyzing a chain (*.txt file) and calculating the 2D stats for a particular pair of variables.

Kernel Density Estimation (KDE)

This module contains a class for implementing weighted KDE with or without fast Fourier transforms (FFT).

Adapted from SciPy code to support weighted KDE and fast Fourier transforms.

See the discussion on Stack Overflow.

class superplot.statslib.kde.gaussian_kde(dataset, bw_method='scott', weights=None, fft=True)[source]

Representation of a kernel-density estimate using Gaussian kernels.

Kernel density estimation is a way to estimate the probability density function (PDF) of a random variable in a non-parametric way. gaussian_kde works for both uni-variate and multi-variate data. It includes automatic bandwidth determination. The estimation works best for a unimodal distribution; bimodal or multi-modal distributions tend to be oversmoothed.

Parameters:
dataset : array_like
Datapoints to estimate from. In case of univariate data this is a 1-D array, otherwise a 2-D array with shape (# of dims, # of data).
bw_method : str, scalar or callable, optional
The method used to calculate the estimator bandwidth. This can be 'scott', 'silverman', a scalar constant or a callable. If a scalar, this will be used directly as kde.factor. If a callable, it should take a gaussian_kde instance as its only parameter and return a scalar. If None, 'scott' is used; 'scott' is also the default. See Notes for more details.
weights : array_like, shape (n, ), optional, default: None
An array of weights, one per data point. Each data point contributes its associated weight to the density estimate (instead of 1).
fft : bool
Whether to use fast Fourier transforms (FFT), which can be much faster.
Attributes:
dataset : ndarray
The dataset with which gaussian_kde was initialized.
d : int
Number of dimensions.
n : int
Number of datapoints.
neff : float
Effective sample size using Kish’s approximation.
factor : float
The bandwidth factor, obtained from kde.covariance_factor, with which the covariance matrix is multiplied.
covariance : ndarray
The covariance matrix of dataset, scaled by the calculated bandwidth (kde.factor).
inv_cov : ndarray
The inverse of covariance.
Methods:
kde.evaluate(points) : ndarray
Evaluate the estimated pdf on a provided set of points.
kde(points) : ndarray
Same as kde.evaluate(points)
kde.pdf(points) : ndarray
Alias for kde.evaluate(points).
kde.set_bandwidth(bw_method='scott') : None
Computes the bandwidth, i.e. the coefficient that multiplies the data covariance matrix to obtain the kernel covariance matrix. New in version 0.11.0.
kde.covariance_factor : float
Computes the coefficient (kde.factor) that multiplies the data covariance matrix to obtain the kernel covariance matrix. The default is scotts_factor. A subclass can overwrite this method to provide a different method, or set it through a call to kde.set_bandwidth.
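
To make the role of the weights and of neff concrete, here is a minimal one-dimensional sketch of a weighted Gaussian KDE evaluated directly (no FFT). It is not the class documented above: the function names are hypothetical, and using Scott's factor with Kish's effective sample size in place of n is an assumption made for illustration.

```python
import numpy as np


def kish_neff(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    weights = np.asarray(weights, dtype=float)
    return weights.sum() ** 2 / np.sum(weights ** 2)


def weighted_kde_1d(samples, weights, points):
    """Evaluate a weighted Gaussian KDE at `points` (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    points = np.asarray(points, dtype=float)

    # Normalise the weights so that the estimate integrates to one.
    norm_weights = weights / weights.sum()

    # Weighted mean and (biased) weighted variance of the samples.
    mean = np.sum(norm_weights * samples)
    variance = np.sum(norm_weights * (samples - mean) ** 2)

    # Assumption: Scott's factor for d = 1, with neff in place of n.
    factor = kish_neff(weights) ** (-1.0 / 5.0)
    bandwidth = factor * np.sqrt(variance)

    # One Gaussian kernel per sample, summed with the normalised weights.
    z = (points[:, np.newaxis] - samples[np.newaxis, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return np.dot(kernels, norm_weights)


grid = np.linspace(-20.0, 20.0, 4001)
density = weighted_kde_1d([-1.0, 0.0, 0.5, 2.0], [1.0, 2.0, 1.0, 1.0], grid)
```

With equal weights neff equals the number of samples, so the sketch reduces to an unweighted estimate.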

Bandwidth selection strongly influences the estimate obtained from the KDE (much more so than the actual shape of the kernel). Bandwidth selection can be done by a “rule of thumb”, by cross-validation, by “plug-in methods” or by other means; see [3], [4] for reviews. gaussian_kde uses a rule of thumb, the default is Scott’s Rule.

Scott’s Rule [1], implemented as scotts_factor, is:

n**(-1./(d+4)),

with n the number of data points and d the number of dimensions. Silverman’s Rule [2], implemented as silverman_factor, is:

(n * (d + 2) / 4.)**(-1. / (d + 4)).

Good general descriptions of kernel density estimation can be found in [1] and [2], the mathematics for this multi-dimensional implementation can be found in [1].
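
The two rules of thumb quoted above are simple enough to write as standalone functions. The sketch below does so purely for illustration; in the class they are methods, whereas here n and d are passed explicitly.

```python
def scotts_factor(n, d):
    """Scott's Rule: n**(-1/(d+4)) for n data points in d dimensions."""
    return n ** (-1.0 / (d + 4))


def silverman_factor(n, d):
    """Silverman's Rule: (n * (d + 2) / 4)**(-1/(d+4))."""
    return (n * (d + 2) / 4.0) ** (-1.0 / (d + 4))


# Compare the two bandwidth factors for 2000 points in one and two dimensions.
for d in (1, 2):
    print(d, scotts_factor(2000, d), silverman_factor(2000, d))
```

For d = 2 the term (d + 2) / 4 equals one, so the two rules give the same factor; they differ in other dimensions.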

[1](1, 2, 3) D.W. Scott, “Multivariate Density Estimation: Theory, Practice, and Visualization”, John Wiley & Sons, New York, Chicester, 1992.
[2](1, 2) B.W. Silverman, “Density Estimation for Statistics and Data Analysis”, Vol. 26, Monographs on Statistics and Applied Probability, Chapman and Hall, London, 1986.
[3]B.A. Turlach, “Bandwidth Selection in Kernel Density Estimation: A Review”, CORE and Institut de Statistique, Vol. 19, pp. 1-33, 1993.
[4]D.M. Bashtannyk and R.J. Hyndman, “Bandwidth selection for kernel conditional density estimation”, Computational Statistics & Data Analysis, Vol. 36, pp. 279-298, 2001.

Generate some random two-dimensional data:

>>> import numpy as np
>>> from scipy import stats
>>> def measure(n):
...     "Measurement model, return two coupled measurements."
...     m1 = np.random.normal(size=n)
...     m2 = np.random.normal(scale=0.5, size=n)
...     return m1+m2, m1-m2
...
>>> m1, m2 = measure(2000)
>>> xmin = m1.min()
>>> xmax = m1.max()
>>> ymin = m2.min()
>>> ymax = m2.max()

Perform a kernel density estimate on the data:

>>> X, Y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
>>> positions = np.vstack([X.ravel(), Y.ravel()])
>>> values = np.vstack([m1, m2])
>>> kernel = stats.gaussian_kde(values)
>>> Z = np.reshape(kernel(positions).T, X.shape)

Plot the results:

>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111)
>>> ax.imshow(np.rot90(Z), cmap=plt.cm.gist_earth_r,
...           extent=[xmin, xmax, ymin, ymax])
>>> ax.plot(m1, m2, 'k.', markersize=2)
>>> ax.set_xlim([xmin, xmax])
>>> ax.set_ylim([ymin, ymax])
>>> plt.show()

plotlib

plotlib.base

This module contains abstract base classes, used to implement Plots.

class superplot.plotlib.base.OneDimPlot(data, plot_options)[source]

Abstract base class for one dimensional plot types. Handles initialization tasks common to one dimensional plots.

class superplot.plotlib.base.Plot(data, plot_options)[source]

Abstract base class for all plot types. Specifies the interface for creating a plot object and getting the figure associated with it. Performs any common preprocessing and initialization (i.e. log scaling).

Parameters:
  • data (dict) – Data dictionary loaded from chain file by data_loader
  • plot_options (namedtuple) – plot_options.plot_options configuration tuple.
figure()[source]

Abstract method - return the pyplot figure associated with this plot.

Returns: Matplotlib figure, list of plot specific summary strings
Return type: named tuple (figure: matplotlib.figure.Figure, summary: list)
class plot_data(figure, summary)

Return data type for figure() method.

figure
summary
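
The interface described above (an abstract figure() method returning a named tuple with figure and summary fields) might be sketched as follows. The class names `PlotSketch` and `DummyPlot` are hypothetical, the summary string is invented, and a plain object stands in for the matplotlib figure so that the example has no plotting dependency.

```python
from abc import ABC, abstractmethod
from collections import namedtuple


class PlotSketch(ABC):
    """Hypothetical stand-in for an abstract plot base class."""

    # Return type of figure(): the figure plus a list of summary strings.
    plot_data = namedtuple("plot_data", ("figure", "summary"))

    def __init__(self, data, plot_options):
        self.data = data
        self.plot_options = plot_options

    @abstractmethod
    def figure(self):
        """Return a plot_data named tuple."""


class DummyPlot(PlotSketch):
    """Concrete subclass that must implement figure()."""

    def figure(self):
        fig = object()  # placeholder for a matplotlib.figure.Figure
        summary = ["columns: %d" % len(self.data)]
        return self.plot_data(figure=fig, summary=summary)


result = DummyPlot(data={0: [1, 2, 3]}, plot_options=None).figure()
print(result.summary)
```

Returning a named tuple lets callers use either `result.figure` or tuple unpacking, and the abstract method prevents the base class from being instantiated directly.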
class superplot.plotlib.base.TwoDimPlot(data, plot_options)[source]

Abstract base class for two dimensional plot types (plus the 3D scatter plot which is an honorary two dimensional plot for now). Handles initialization tasks common to these plot types.

plotlib.plot_mod

General functions for plotting data, defined once so that they can be used/edited in a consistent manner.

superplot.plotlib.plot_mod.appearance(plot_name)[source]

Specify the plot's appearance (e.g. font types) from an mplstyle file.

Options in the style sheet associated with the plot name override any in default.mplstyle.

Parameters: plot_name (string) – Name of the plot (class name)
superplot.plotlib.plot_mod.legend(leg_title=None, leg_position=None)[source]

Turn on the legend.

Warning

Legend properties are specified by mplstyle, but could be overridden here.

Parameters:
  • leg_title (string) – Title of legend
  • leg_position (string) – Position of legend
superplot.plotlib.plot_mod.plot_band(x_data, y_data, width, ax, scheme)[source]

Plot a band around a line.

This is typically for a theoretical error. Vary x by +/- width and find the variation in y. Fill between these largest and smallest y for a given x.

Parameters:
  • x_data (numpy.ndarray) – x-data to be plotted
  • y_data (numpy.ndarray) – y-data to be plotted
  • width (integer) – Width of band - width on the left and right hand-side
  • ax (matplotlib.axes.Axes) – An axis object to plot the band on
  • scheme (schemes.Scheme) – Object containing appearance options, colours etc
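
The band construction described above (vary x by +/- width, then take the smallest and largest y in that window) can be sketched with NumPy alone. The helper name `band_limits` is hypothetical, and the sketch only considers the sampled points; whether superplot interpolates between them is not stated here.

```python
import numpy as np


def band_limits(x_data, y_data, width):
    """Smallest and largest y within +/- width of each x (hypothetical helper)."""
    x_data = np.asarray(x_data, dtype=float)
    y_data = np.asarray(y_data, dtype=float)
    lower = np.empty_like(y_data)
    upper = np.empty_like(y_data)
    for i, x in enumerate(x_data):
        # Select every sampled point whose x lies inside the window.
        nearby = np.abs(x_data - x) <= width
        lower[i] = y_data[nearby].min()
        upper[i] = y_data[nearby].max()
    return lower, upper


x = np.arange(5.0)
lower, upper = band_limits(x, x, width=1)
# On a matplotlib Axes, ax.fill_between(x, lower, upper) would draw the band.
```

For the straight line y = x the band is one unit above and below each point, clipped at the ends of the data.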
superplot.plotlib.plot_mod.plot_contour(data, levels, scheme, bin_limits)[source]

Make unfilled contours for a plot.

Parameters:
  • data (numpy.ndarray) – Data to be contoured
  • levels (list [float,]) – Levels at which to draw contours
  • scheme (schemes.Scheme) – Object containing appearance options, colours etc
  • bin_limits (list [[xmin,xmax],[ymin,ymax]]) – Bin limits
superplot.plotlib.plot_mod.plot_data(x, y, scheme, zorder=1)[source]

Plot a point with a particular color scheme.

Parameters:
  • x (numpy.ndarray, numpy.dtype) – Data to be plotted on x-axis
  • y (numpy.ndarray, numpy.dtype) – Data to be plotted on y-axis
  • scheme (schemes.Scheme) – Object containing plot appearance options
  • zorder (integer) – Draw order - lower numbers are plotted first
superplot.plotlib.plot_mod.plot_filled_contour(data, levels, scheme, bin_limits)[source]

Make filled contours for a plot.

Parameters:
  • data (numpy.ndarray) – Data to be contoured
  • levels (list [float,]) – Levels at which to draw contours
  • scheme (schemes.Scheme) – Object containing appearance options, colours etc
  • bin_limits (list [[xmin,xmax],[ymin,ymax]]) – Bin limits
superplot.plotlib.plot_mod.plot_image(data, bin_limits, plot_limits, scheme)[source]

Plot data as an image.

Warning

Interpolation can be misleading. To disable it, set interpolation='nearest'.

Parameters:
  • data (numpy.ndarray) – x-, y- and z-data
  • bin_limits (list [[xmin,xmax],[ymin,ymax]]) – Bin limits
  • plot_limits (list [xmin,xmax,ymin,ymax]) – Plot limits
  • scheme (schemes.Scheme) – Object containing appearance options, colours etc
superplot.plotlib.plot_mod.plot_labels(xlabel, ylabel, plot_title=None, title_position='right')[source]

Plot axis labels.

Parameters:
  • xlabel (string) – Label for x-axis
  • ylabel (string) – Label for y-axis
  • plot_title (string) – Title appearing above plot
  • title_position (string) – Location of title
superplot.plotlib.plot_mod.plot_limits(ax, limits=None)[source]

Set the plot limits, if they are specified.

Parameters:
  • ax (matplotlib.axes.Axes) – Axis object
  • limits (list [xmin,xmax,ymin,ymax]) – Plot limits
superplot.plotlib.plot_mod.plot_ticks(xticks, yticks, ax)[source]

Set the number of major ticks on each axis.

Parameters:
  • ax (matplotlib.axes.Axes) – Axis object
  • xticks (integer) – Number of required major x ticks
  • yticks (integer) – Number of required major y ticks

plotlib.plots

Implementation of plot classes. These inherit from the classes in plotlib.base and must specify a figure() method which returns a matplotlib figure object.

Plots should also have a “description” attribute with a one line description of the type of plot.

A list of implemented plot classes plotlib.plots.plot_types is found at the bottom of this module. This is useful for the GUI, which needs to enumerate the available plots. So if a new plot type is implemented, it should be added to this list.

Also includes a function to save the current plot.

class superplot.plotlib.plots.OneDimChiSq(data, plot_options)[source]

Makes a one dimensional plot, showing delta-chisq only, and excluded regions.

class superplot.plotlib.plots.OneDimStandard(data, plot_options)[source]

Makes a one dimensional plot, showing profile likelihood, marginalised posterior, and statistics.

class superplot.plotlib.plots.Scatter(data, plot_options)[source]

Makes a three dimensional scatter plot, showing the best-fit point, posterior mean, credible regions and confidence intervals. The scattered points are coloured by the zdata.

class superplot.plotlib.plots.TwoDimPlotFilledPDF(data, plot_options)[source]

Makes a two dimensional plot with filled credible regions only, showing best-fit and posterior mean.

class superplot.plotlib.plots.TwoDimPlotFilledPL(data, plot_options)[source]

Makes a two dimensional plot with filled confidence intervals only, showing best-fit and posterior mean.

class superplot.plotlib.plots.TwoDimPlotPDF(data, plot_options)[source]

Makes a two dimensional marginalised posterior plot, showing best-fit and posterior mean and credible regions.

class superplot.plotlib.plots.TwoDimPlotPL(data, plot_options)[source]

Makes a two dimensional profile likelihood plot, showing best-fit and posterior mean and confidence intervals.

superplot.plotlib.plots.plot_types = [<class 'superplot.plotlib.plots.OneDimStandard'>, <class 'superplot.plotlib.plots.OneDimChiSq'>, <class 'superplot.plotlib.plots.TwoDimPlotFilledPDF'>, <class 'superplot.plotlib.plots.TwoDimPlotFilledPL'>, <class 'superplot.plotlib.plots.TwoDimPlotPDF'>, <class 'superplot.plotlib.plots.TwoDimPlotPL'>, <class 'superplot.plotlib.plots.Scatter'>]

List of Plot classes in this module.
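
A list of classes with a "description" attribute is enough for a GUI to build a menu and map a selected index back to a class. The sketch below illustrates that pattern with two hypothetical placeholder classes; it does not import superplot.

```python
class OneDimExample(object):
    """Hypothetical plot class with a one line description."""
    description = "One-dimensional example plot."


class TwoDimExample(object):
    """Hypothetical plot class with a one line description."""
    description = "Two-dimensional example plot."


# Registry in the style of plot_types: new plot classes are appended here.
plot_types = [OneDimExample, TwoDimExample]


def describe(plot_classes):
    """Return (index, description) pairs, e.g. for a drop-down menu."""
    return [(index, cls.description) for index, cls in enumerate(plot_classes)]


menu = describe(plot_types)
selected = plot_types[0]  # an integer index selects the plot class
```

Keeping the registry as a plain list means the index used by the GUI is just the position of the class in the list.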

superplot.plotlib.plots.save_plot(name)[source]

Save a plot with a descriptive name.

Warning

Figure properties are specified by mplstyle, but could be overridden here.

Parameters: name (string) – Prefix of filename, without extension
superplot.plotlib.base.plot_types

data_loader

This module contains code for:

  • Opening and processing a *.txt data file.
  • Opening and processing an *.info information file.
  • Using the *.info file to label the data.
superplot.data_loader.load(info_file, data_file)[source]

Read data from *.info file and *.txt file.

Parameters:
  • data_file (string) – Name of *.txt file
  • info_file (string) – Name of *.info file
Returns:

Dictionary with chain’s labels and array of data

Return type:

dict (labels), array (data)

plot_options

This module provides a named tuple plot_options to represent the options as selected in the UI. Also loads default values from config.yml and makes them available.

TODO: This module should also do a reasonable amount of validation
of config variables.
superplot.plot_options.default(option)[source]

Retrieve the default value of a plot option.

If no default is available, prints an error message and raises a KeyError.

Parameters: option (string) – Name of the option
Returns: Default value of specified option.
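
The lookup behaviour described above (return the default, or print a message and raise a KeyError) could look roughly like this. The dictionary contents and the name `default_sketch` are hypothetical; the real defaults come from config.yml.

```python
# Hypothetical defaults standing in for values loaded from config.yml.
_DEFAULTS = {"nbins": 70, "logx": False}


def default_sketch(option, defaults=_DEFAULTS):
    """Return the default for `option`, or report and re-raise KeyError."""
    try:
        return defaults[option]
    except KeyError:
        # Print a message for the user, then let the KeyError propagate.
        print("No default available for option: {}".format(option))
        raise


nbins = default_sketch("nbins")
```

Re-raising after printing keeps the error visible to calling code while still telling the user which option was missing.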
superplot.plot_options.get_config(yaml_file='config.yml')[source]

Load the config file, either from the user data directory, or if that is not available, the installed copy.

Parameters: yaml_file (str) – Name of yaml file
Returns: config
Return type: dict
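
The fallback order described above (user data directory first, installed copy otherwise) can be sketched as a path lookup. The function name and both directory arguments are hypothetical, and parsing is left out; a YAML parser such as PyYAML's `yaml.safe_load` could be applied to the chosen file.

```python
import os
import tempfile


def resolve_config_path(yaml_file, user_dir, install_dir):
    """Prefer the copy in user_dir; fall back to the copy in install_dir."""
    user_path = os.path.join(user_dir, yaml_file)
    if os.path.exists(user_path):
        return user_path
    return os.path.join(install_dir, yaml_file)


# Demonstration with temporary directories standing in for the real ones.
user_dir = tempfile.mkdtemp()
install_dir = tempfile.mkdtemp()
with open(os.path.join(install_dir, "config.yml"), "w") as handle:
    handle.write("nbins: 70\n")

# No user copy exists yet, so the installed copy is chosen.
chosen = resolve_config_path("config.yml", user_dir, install_dir)
```

Once a file with the same name is placed in the user directory, the same call returns that copy instead.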
class superplot.plot_options.plot_options(xindex, yindex, zindex, logx, logy, logz, plot_limits, bin_limits, cb_limits, nbins, xticks, yticks, cbticks, alpha, tau, xlabel, ylabel, zlabel, plot_title, title_position, leg_title, leg_position, show_best_fit, show_posterior_mean, show_posterior_median, show_posterior_mode, show_conf_intervals, show_credible_regions, show_posterior_pdf, show_prof_like, kde_pdf, bw_method)
alpha
bin_limits
bw_method
cb_limits
cbticks
kde_pdf
leg_position
leg_title
logx
logy
logz
nbins
plot_limits
plot_title
show_best_fit
show_conf_intervals
show_credible_regions
show_posterior_mean
show_posterior_median
show_posterior_mode
show_posterior_pdf
show_prof_like
tau
title_position
xindex
xlabel
xticks
yindex
ylabel
yticks
zindex
zlabel

schemes

This module contains the Scheme class, which is used to hold information about how individual elements should appear in a plot.

Schemes are defined in config.yml. On import, this module loads each Scheme and attaches it as a module attribute with the defined name.

class superplot.schemes.Scheme(colour=None, symbol=None, label=None, level_names=None, colour_map=None, number_colours=None, colour_bar_title=None, size=5, colours=None)[source]

Holds information for how a piece of data should be plotted. All parameters are optional - Schemes can specify any subset of the available attributes.

Parameters:
  • colour (string) – Colour for a line / point.
  • symbol (string) – Indicates point style, e.g. circle 'o', or line style, e.g. '--'.
  • label (string) – Label for legend.
  • level_names (list) – List of contour level names, i.e. for confidence regions.
  • colour_map (string) – Colour map for 2D plots. Must be the name of a matplotlib colour map.
  • number_colours (int) – Number of colours to appear on colour map. If None, continuum.
  • colour_bar_title (string) – Title for colour bar.
  • size (integer) – Size of points.
  • colours (list) – List of colours to be iterated, for, e.g., filled contours.
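
As a rough sketch of the pattern described for this module (a class whose parameters are all optional, with each named scheme attached as a module attribute), consider the following. The class `SchemeSketch`, the scheme names and their values are hypothetical, and a plain dictionary stands in for the parsed config.yml.

```python
class SchemeSketch(object):
    """Hypothetical, cut-down scheme: every parameter is optional."""

    def __init__(self, colour=None, symbol=None, label=None, size=5):
        self.colour = colour
        self.symbol = symbol
        self.label = label
        self.size = size


# Stand-in for the schemes section of a parsed config file (invented values).
scheme_config = {
    "best_fit": {"colour": "red", "symbol": "*", "label": "Best-fit point"},
    "posterior_mean": {"colour": "green", "symbol": "o"},
}

# Build one scheme per entry; unspecified attributes keep their defaults.
schemes = dict(
    (name, SchemeSketch(**params)) for name, params in scheme_config.items()
)

# Attach each scheme to the module namespace under its configured name.
for name, scheme in schemes.items():
    globals()[name] = scheme
```

After this runs, a scheme can be reached by name (for example `best_fit.colour`), which is the convenience that loading schemes on import provides.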
superplot.schemes.scheme_from_yaml(yaml_file)[source]

summary

Summary of chain

A stand-alone script to print summary statistics about a chain.