public class JSATData extends Object
Any dataset can be read back in as a SimpleDataSet, and ClassificationDataSet and RegressionDataSet datasets can be read back in as their original types.
A JSAT binary file may be larger than the equivalent ARFF or LIBSVM file. This is because JSAT always uses 32 or 64 bits (4 or 8 bytes) for every value, whereas values stored as a string can use as little as 1 or 2 bytes for simple values. However, JSAT's storage is consistent: values with many significant digits (such as the number "0.25098039215686274") take additional bytes in the human-readable ARFF and LIBSVM formats, while the binary JSAT format stays the same size.
The data can be compressed with a GZIPOutputStream when storing a dataset, and then decompressed with a GZIPInputStream when read back in.
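The size trade-off and the GZIP round trip described above can be sketched with the standard library alone. This is a minimal illustration, not JSAT's file layout: `StorageSizeSketch` and its methods are hypothetical names, and a plain `DataOutputStream` stands in for the JSAT writer. With JSAT on the classpath, the same GZIP streams would presumably wrap the streams handed to `writeData` and `load`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; it is not part of JSAT.
final class StorageSizeSketch {

    // Bytes needed to store the value as decimal text, as a text format would.
    static int textBytes(double value) {
        return Double.toString(value).getBytes(StandardCharsets.US_ASCII).length;
    }

    // Bytes needed to store the value as a fixed-width 64-bit binary double.
    static int binaryBytes(double value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeDouble(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    // Writes the values as binary doubles through a GZIPOutputStream.
    static byte[] compress(double[] values) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(sink))) {
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the stream chain above finishes the GZIP trailer.
        return sink.toByteArray();
    }

    // Reads 'count' binary doubles back through a GZIPInputStream.
    static double[] decompress(byte[] compressed, int count) {
        double[] values = new double[count];
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(compressed)))) {
            for (int i = 0; i < count; i++) {
                values[i] = in.readDouble();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        double simple = 1.0;
        double precise = 0.25098039215686274;
        System.out.println("text bytes:   " + textBytes(simple) + " vs " + textBytes(precise));
        System.out.println("binary bytes: " + binaryBytes(simple) + " vs " + binaryBytes(precise));
    }
}
```

The binary size is the same for both values, while the text size grows with the number of digits, which is the consistency argument made in the class description.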
Modifier and Type | Class and Description |
---|---|
static class | JSATData.DatasetTypeMarker |
static class | JSATData.FloatStorageMethod |
Modifier and Type | Field and Description |
---|---|
static byte[] | MAGIC_NUMBER |
static byte | STRING_ENCODING_ASCII |
static byte | STRING_ENCODING_UTF_16 |
Modifier and Type | Method and Description |
---|---|
static DataSet<?> | load(InputStream inRaw): Loads a JSAT dataset from an input stream, and will not do any of its own buffering. |
protected static DataSet<?> | load(InputStream inRaw, boolean forceAsStandard): Loads a JSAT dataset from an input stream, and will not do any of its own buffering. |
static ClassificationDataSet | loadClassification(InputStream inRaw): Loads in a JSAT dataset as a ClassificationDataSet. |
static RegressionDataSet | loadRegression(InputStream inRaw): Loads in a JSAT dataset as a RegressionDataSet. |
static SimpleDataSet | loadSimple(InputStream inRaw): Loads in a JSAT dataset as a SimpleDataSet. |
static <Type extends DataSet<Type>> void | writeData(DataSet<Type> dataset, OutputStream outRaw): Writes out a JSAT dataset to a binary format that can be read in again later, and could be read in other languages. The format understands ClassificationDataSet and RegressionDataSet datasets as special cases, and will store the target values in the binary file. |
static <Type extends DataSet<Type>> void | writeData(DataSet<Type> dataset, OutputStream outRaw, JSATData.FloatStorageMethod fpStore): Writes out a JSAT dataset to a binary format that can be read in again later, and could be read in other languages. The format understands ClassificationDataSet and RegressionDataSet datasets as special cases, and will store the target values in the binary file. |
public static final byte[] MAGIC_NUMBER
public static final byte STRING_ENCODING_ASCII
public static final byte STRING_ENCODING_UTF_16
public static <Type extends DataSet<Type>> void writeData(DataSet<Type> dataset, OutputStream outRaw) throws IOException
Writes out a JSAT dataset to a binary format that can be read in again later, and could be read in other languages.
The format understands ClassificationDataSet and RegressionDataSet datasets as special cases, and will store the target values in the binary file. When read back in, they can be returned as their original dataset type, or treated as normal fields as a SimpleDataSet.
Type Parameters:
Type
Parameters:
dataset - the dataset to write out to a binary file
outRaw - the raw output stream; the caller should provide a buffered stream.
Throws:
IOException
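Why the caller is asked to supply a buffered stream can be seen in a small standard-library sketch. It does not call JSAT: `BufferingSketch` and `CountingSink` are hypothetical names, and a `DataOutputStream` writing doubles stands in for whatever writes to `outRaw`. The sketch only counts how many write calls reach the underlying stream with and without a `BufferedOutputStream` in between.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration; it is not part of JSAT.
final class BufferingSketch {

    // Sink that discards all data but counts the write calls that reach it.
    static final class CountingSink extends OutputStream {
        long calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Writes 'count' doubles and reports how many write calls hit the sink.
    static long writesReachingSink(int count, boolean buffered) {
        CountingSink sink = new CountingSink();
        OutputStream target = buffered ? new BufferedOutputStream(sink) : sink;
        try (DataOutputStream out = new DataOutputStream(target)) {
            for (int i = 0; i < count; i++) {
                out.writeDouble(i * 0.5);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered calls: " + writesReachingSink(1000, false));
        System.out.println("buffered calls:   " + writesReachingSink(1000, true));
    }
}
```

When the sink is a file or socket, each call that reaches it can be a system call, so wrapping the stream once before passing it on is usually the cheaper choice.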
public static <Type extends DataSet<Type>> void writeData(DataSet<Type> dataset, OutputStream outRaw, JSATData.FloatStorageMethod fpStore) throws IOException
Writes out a JSAT dataset to a binary format that can be read in again later, and could be read in other languages.
The format understands ClassificationDataSet and RegressionDataSet datasets as special cases, and will store the target values in the binary file. When read back in, they can be returned as their original dataset type, or treated as normal fields as a SimpleDataSet.
Type Parameters:
Type
Parameters:
dataset - the dataset to write out to a binary file
outRaw - the raw output stream; the caller should provide a buffered stream.
fpStore - the method used to store floating point values, which may result in a loss of precision depending on the method chosen.
Throws:
IOException
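The loss of precision mentioned for fpStore can be illustrated without JSAT. The sketch below assumes, purely as an example, a storage method that keeps values as 32-bit floats; which storage methods `JSATData.FloatStorageMethod` actually offers is not listed in this page. `FloatPrecisionSketch` is a hypothetical name.

```java
// Hypothetical helper class for illustration; it is not part of JSAT.
final class FloatPrecisionSketch {

    // Round-trips a double through 32-bit float storage and back.
    static double viaFloat(double value) {
        return (double) (float) value;
    }

    // Absolute difference introduced by the narrower storage.
    static double absoluteError(double value) {
        return Math.abs(value - viaFloat(value));
    }

    public static void main(String[] args) {
        double original = 0.25098039215686274;
        System.out.println("original:  " + original);
        System.out.println("via float: " + viaFloat(original));
        System.out.println("error:     " + absoluteError(original));
    }
}
```

Values that are exactly representable in the narrower type, such as 0.5, survive unchanged, so whether a smaller storage method is acceptable depends on the data.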
public static DataSet<?> load(InputStream inRaw) throws IOException
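Loads a JSAT dataset from an input stream, and will not do any of its own buffering. The dataset is returned as a SimpleDataSet, ClassificationDataSet, or RegressionDataSet, depending on what type of dataset was originally written out.
Parameters:
inRaw - the input stream; the caller should buffer it
Throws:
IOException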
public static SimpleDataSet loadSimple(InputStream inRaw) throws IOException
Loads in a JSAT dataset as a SimpleDataSet. So long as the input stream is valid, this will not fail.
Parameters:
inRaw - the input stream; the caller should buffer it
Throws:
IOException
public static ClassificationDataSet loadClassification(InputStream inRaw) throws IOException
Loads in a JSAT dataset as a ClassificationDataSet. An exception will be thrown if the original dataset in the file was not a ClassificationDataSet.
Parameters:
inRaw - the input stream; the caller should buffer it
Throws:
IOException
ClassCastException - if the original dataset was not a ClassificationDataSet
public static RegressionDataSet loadRegression(InputStream inRaw) throws IOException
Loads in a JSAT dataset as a RegressionDataSet. An exception will be thrown if the original dataset in the file was not a RegressionDataSet.
Parameters:
inRaw - the input stream; the caller should buffer it
Throws:
IOException
ClassCastException - if the original dataset was not a RegressionDataSet
protected static DataSet<?> load(InputStream inRaw, boolean forceAsStandard) throws IOException
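Loads a JSAT dataset from an input stream, and will not do any of its own buffering. The dataset is returned as a SimpleDataSet, ClassificationDataSet, or RegressionDataSet, depending on what type of dataset was originally written out. The load can also be forced to return a SimpleDataSet.
Parameters:
inRaw - the input stream; the caller should buffer it
forceAsStandard - true to force the dataset to be loaded as a SimpleDataSet; otherwise the type will be determined from the input stream's contents.
Throws:
IOException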
Copyright © 2017. All rights reserved.