sasnets package

Submodules

sasnets.analysis module

File used for analysis of SASNet networks using various techniques, including dendrograms and confusion matrices.

sasnets.analysis.cpredict(model, x, l=71, pl=5000)

Runs a Keras model to create a confusion matrix.

Parameters:
  • model – Model to use.
  • x – A list of x values to predict on.
  • l – The number of input models.
  • pl – The number of data iterations per model.
Returns:

A confusion matrix of percentages.
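
The idea of a confusion matrix of percentages can be sketched in plain Python. This is a minimal illustration, not the cpredict implementation: it assumes that the samples are ordered so that each of the l input models contributes pl consecutive entries, and the hypothetical list predicted stands in for the class indices a Keras model would output.

```python
def confusion_percentages(predicted, n_models, per_model):
    """Row i, column j: percentage of samples from model i predicted as model j."""
    counts = [[0] * n_models for _ in range(n_models)]
    for index, predicted_label in enumerate(predicted):
        # Assumption: samples are grouped by true model, per_model at a time.
        true_label = index // per_model
        counts[true_label][predicted_label] += 1
    return [[100.0 * c / per_model for c in row] for row in counts]


# Illustrative data: 3 models, 3 samples each (hypothetical predictions).
predicted = [0, 0, 1, 1, 1, 1, 2, 0, 2]
cm = confusion_percentages(predicted, n_models=3, per_model=3)
for row in cm:
    print(["%.1f" % value for value in row])
```

Each row sums to 100, so the diagonal entry is the percentage of samples from that model that were classified correctly.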

sasnets.analysis.dcluster(model, x, names)

Displays a dendrogram clustering based on the confusion matrix.

Parameters:
  • model – The model to predict on.
  • x – A list of x values to predict on.
  • names – List of all model names.
Returns:

The dendrogram object.

sasnets.analysis.fit(mn, q, iq)

Fits the resulting data using a bumps server. Currently unimplemented.

Parameters:
  • mn – Model name.
  • q – List of q values.
  • iq – List of I(q) values.
Returns:

Bumps fit.

sasnets.analysis.load_from(path)

Loads a model from the specified path.

Parameters:
  • path – Relative or absolute path to the .h5 model file.
Returns:

The loaded model.

sasnets.analysis.main(args)

Main method. Called from command line; uses argparse.

Parameters:
  • args – Arguments from command line.
Returns:

None.

sasnets.analysis.predict(model, x, names, num=5)

Runs a Keras model to predict based on input.

Parameters:
  • model – The model to use.
  • x – The x inputs to predict from.
  • names – A list of all model names.
  • num – The top num probabilities and models will be printed.
Returns:

None
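
The role of num can be illustrated with a short sketch. It is not the predict implementation: the helper top_predictions, the model names, and the probability values are all hypothetical, and the probability list stands in for one row of a Keras model's output.

```python
def top_predictions(probabilities, names, num=5):
    """Return the num most probable (name, probability) pairs, best first."""
    ranked = sorted(zip(names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:num]


# Illustrative values only; a real run would take these from model output.
names = ["sphere", "cylinder", "ellipsoid", "lamellar"]
probabilities = [0.10, 0.65, 0.20, 0.05]

top = top_predictions(probabilities, names, num=2)
for name, probability in top:
    print("%s: %.2f" % (name, probability))
```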

sasnets.analysis.predict_and_val(model, x, y, names)

Runs the model on the input datasets and compares the results with the correct labels provided from y.

Parameters:
  • model – The model to evaluate.
  • x – List of x values to predict on.
  • y – List of correct labels to compare the predictions against.
  • names – A list of all possible model names.
Returns:

Two lists, il and nl, which are the indices of the model and its proper name respectively.

sasnets.analysis.rpredict(model, x, names)

Same as predict, but outputs names only.

Parameters:
  • model – The model to use.
  • x – List of x to predict on.
  • names – List of all model names.
Returns:

List of predicted names.

sasnets.analysis.tcluster(model, x, names)

Displays a t-SNE cluster coloured by the model predicted labels.

Parameters:
  • model – Model to use.
  • x – List of x values to predict on.
  • names – List of all model names.
Returns:

The t-SNE object that was plotted.

sasnets.hyp module

sasnets.hyp.data()
sasnets.hyp.model(xtrain, ytrain, xtest, ytest)

sasnets.sas_io module

Collection of utility IO functions used in SASNet. Contains the read-from-disk functions as well as the SQL generator.

sasnets.sas_io.read_h(l)

Read helper for parallel read.

Parameters:
  • l – A list of filenames to read from.
Returns:

Three lists, Q, IQ, and Y, corresponding to Q data, I(Q) data, and model labels respectively.

sasnets.sas_io.read_parallel_1d(path, pattern='_eval_')

Reads the files in the folder path whose names match the regex pattern and returns lists of Q, I(Q), and ID. path can be relative or absolute. Uses Pool and map to speed up IO. This function is a work in progress and currently uses an excessive amount of memory; on systems with less than 16 GiB of memory, use read_seq_1d() instead.

Reading 69 files of 150k lines each in parallel, running a garbage collection, and then reading 69 files of 5k lines each in parallel takes around 70 seconds. Reading both sets sequentially without a garbage collection takes around 562 seconds. The parallel read peaks at more than 15 GB of memory with two file-reading threads; the sequential read peaks at around 7 to 10 GB. Use at your own risk, and be prepared to kill the threads or press the reset button.

Assumes files contain 1D data.

Parameters:
  • path – Path to the directory of files to read from.
  • pattern – A regex. Only files matching this regex are opened.
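
The combination of regex filename filtering with Pool and map can be sketched as follows. This is a minimal illustration under stated assumptions, not the read_parallel_1d implementation: the helper names are hypothetical, the pattern is assumed to match anywhere in the filename, and the line-counting worker stands in for the real parser, whose file format is not described here.

```python
import os
import re
from multiprocessing import Pool


def matching_files(path, pattern="_eval_"):
    """Return sorted full paths of files in path whose names match pattern."""
    regex = re.compile(pattern)
    return sorted(
        os.path.join(path, name)
        for name in os.listdir(path)
        if regex.search(name)  # assumption: a match anywhere in the name counts
    )


def count_lines(filename):
    """Hypothetical worker; a real reader would parse Q, I(Q), and labels here."""
    with open(filename) as handle:
        return sum(1 for _ in handle)


def read_parallel(path, pattern="_eval_", processes=2):
    """Apply the worker to every matching file using a pool of processes."""
    files = matching_files(path, pattern)
    with Pool(processes) as pool:
        # map() returns one result per file, in the same order as files.
        return pool.map(count_lines, files)
```

When calling read_parallel from a script, place the call under an if __name__ == "__main__": guard, since some platforms start worker processes by re-importing the main module.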

sasnets.sas_io.read_seq_1d(path, pattern='_eval_', typef='aggr', verbosity=False)

Reads the files in the folder path whose names match the regex pattern and returns lists of Q, I(Q), and ID. path can be relative or absolute. Uses a single thread only. read_parallel_1d() is recommended instead, except in hyperopt, where map() is broken.

typef is one of 'json' or 'aggr'. In json mode, only the JSON files in the folder specified by path are read. In aggr mode, aggregated data files are read. See sasmodels/generate_sets.py for more about these formats.

Assumes files contain 1D data.

Parameters:
  • path – Path to the directory of files to read from.
  • pattern – A regex. Only files matching this regex are opened.
  • typef – Type of file to read (aggregate data or json data).
  • verbosity – Controls the verbosity of output.

sasnets.sas_io.sql_dat_gen(dname, mname, dbname='sas_data', host='127.0.0.1', user='sasnets', encoder=None)

A Python generator that reads its data from a PostgreSQL database. Yields a list of (iq, diq) values and a list of labels.

Parameters:
  • dname – The data table name to connect to.
  • mname – The metadata table name to connect to.
  • dbname – The database name.
  • host – The database host.
  • user – The username to connect as.
  • encoder – LabelEncoder for transforming labels to categorical ints.
Returns:

None
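
The pattern of a generator that pulls batches from a database can be sketched with the standard library. This is an illustration only, not the sql_dat_gen implementation: it swaps PostgreSQL for an in-memory sqlite3 database so that it runs without a server, and the table layout, the column names, the batch size, and the shape of each yielded item are all assumptions.

```python
import sqlite3


def batch_gen(conn, table, batch_size=2):
    """Yield ([iq, diq], labels) batches from table until the rows run out."""
    # Table names cannot be bound as SQL parameters, so table must be trusted.
    cursor = conn.execute("SELECT iq, diq, label FROM %s ORDER BY id" % table)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            return
        iq = [row[0] for row in rows]
        diq = [row[1] for row in rows]
        labels = [row[2] for row in rows]
        yield [iq, diq], labels


# Hypothetical demo table standing in for the real data and metadata tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo (id INTEGER PRIMARY KEY, iq REAL, diq REAL, label TEXT)"
)
conn.executemany(
    "INSERT INTO demo (iq, diq, label) VALUES (?, ?, ?)",
    [(1.0, 0.1, "sphere"), (2.0, 0.2, "cylinder"), (3.0, 0.3, "sphere")],
)

batches = list(batch_gen(conn, "demo"))
for data, labels in batches:
    print(data, labels)
```

Because the rows are fetched lazily with fetchmany, only one batch is held in memory at a time, which is the main reason to prefer a generator over reading the whole table up front.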

sasnets.sasnet module

Module contents