Get up and running with DS in minutes.
npm install @tangent/ds
deno add @tangent/ds
<script type="module">
import * as ds from 'https://cdn.jsdelivr.net/npm/@tangent/ds/+esm';
</script>
Let’s run a two-sample t-test comparing the body mass of Adelie and Gentoo penguins.
import * as ds from '@tangent/ds';
const penguinsResponse = await fetch(
'https://cdn.jsdelivr.net/npm/vega-datasets@2/data/penguins.json',
);
const penguinsData = await penguinsResponse.json();
console.table(penguinsData.slice(0, 5));
const testedVariable = 'Body Mass (g)';
// Collect the tested variable for each of the two species being compared
const adelieValues = penguinsData
  .filter((d) => d.Species === 'Adelie')
  .map((d) => d[testedVariable]);
const gentooValues = penguinsData
  .filter((d) => d.Species === 'Gentoo')
  .map((d) => d[testedVariable]);
const ttest = ds.stats.hypothesis.twoSampleTTest(adelieValues, gentooValues);
console.log(ttest);
{
statistic: -18.42963067630639,
pValue: 0.0020000000000000018,
df: 274,
mean1: 3676.315789473684,
mean2: 5035.080645161291,
pooledSE: 73.72718854504609,
alternative: "two-sided"
}
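The `df` and `pooledSE` fields suggest an equal-variance (pooled) two-sample t-test, and the printed statistic equals `(mean1 - mean2) / pooledSE`. The following is a minimal sketch of that calculation in plain JavaScript, not the library's implementation; `pooledTTest` and its helpers are hypothetical names, and the p-value is left out because it needs the t distribution's CDF.

```javascript
// Hypothetical helpers for illustration; not part of @tangent/ds.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs) {
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
}

// Equal-variance two-sample t statistic.
function pooledTTest(a, b) {
  const n1 = a.length;
  const n2 = b.length;
  const df = n1 + n2 - 2;
  // Weighted average of the two sample variances.
  const pooledVariance =
    ((n1 - 1) * sampleVariance(a) + (n2 - 1) * sampleVariance(b)) / df;
  // Standard error of the difference between the two means.
  const pooledSE = Math.sqrt(pooledVariance * (1 / n1 + 1 / n2));
  const mean1 = mean(a);
  const mean2 = mean(b);
  return { statistic: (mean1 - mean2) / pooledSE, df, mean1, mean2, pooledSE };
}

// Toy data: both groups have variance 1, so pooledSE = sqrt(2 / 3).
const toyResult = pooledTTest([1, 2, 3], [4, 5, 6]);
console.log(toyResult.df); // 4
```

The sketch assumes every input is a finite number. If a dataset contains missing measurements, dropping them first (for example with `values.filter((v) => Number.isFinite(v))`) keeps `null` entries from being counted as zeros in the sums.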
DS uses a declarative approach: you describe your data and analysis with plain option objects, then fit the model:
const penguinsFeatureFields = [
'Beak Length (mm)',
'Beak Depth (mm)',
'Flipper Length (mm)',
'Body Mass (g)',
];
const pcaData = penguinsData.map(d =>
penguinsFeatureFields.reduce((row, field) => {
row[field] = d[field];
return row;
}, {})
);
const pca = new ds.mva.PCA({
center: true,
scale: true,
scaling: 2, // correlation biplot
omit_missing: true
});
pca.fit({data: pcaData});
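With centering and scaling both enabled, PCA is equivalent to an eigen-decomposition of the correlation matrix of the selected fields. The sketch below illustrates that idea in plain JavaScript. It is not how DS computes PCA; all function names are hypothetical, dropping incomplete rows is only an assumption about what `omit_missing` does, and power iteration is used here simply because it is short.

```javascript
// Hypothetical helpers for illustration; not part of @tangent/ds.

// Drop rows with a missing field, then center and scale each column.
function standardizeColumns(rows, fields) {
  const complete = rows.filter((r) => fields.every((f) => Number.isFinite(r[f])));
  const n = complete.length;
  const stats = fields.map((f) => {
    const m = complete.reduce((s, r) => s + r[f], 0) / n;
    const v = complete.reduce((s, r) => s + (r[f] - m) ** 2, 0) / (n - 1);
    return { mean: m, sd: Math.sqrt(v) };
  });
  return complete.map((r) =>
    fields.map((f, j) => (r[f] - stats[j].mean) / stats[j].sd)
  );
}

// Correlation matrix of already standardized columns.
function correlationMatrix(z) {
  const n = z.length;
  const p = z[0].length;
  return Array.from({ length: p }, (_, i) =>
    Array.from({ length: p }, (_, j) =>
      z.reduce((s, row) => s + row[i] * row[j], 0) / (n - 1)
    )
  );
}

// Power iteration: converges to the leading eigenvector unless the
// starting vector happens to be orthogonal to it.
function firstComponent(matrix, iterations = 200) {
  let v = matrix.map((_, i) => i + 1);
  let eigenvalue = 0;
  for (let k = 0; k < iterations; k++) {
    const w = matrix.map((row) => row.reduce((s, x, j) => s + x * v[j], 0));
    eigenvalue = Math.sqrt(w.reduce((s, x) => s + x * x, 0));
    v = w.map((x) => x / eigenvalue);
  }
  return { loadings: v, eigenvalue };
}

// Toy data: b is exactly 2 * a, and one row is incomplete.
const toyRows = [
  { a: 1, b: 2 },
  { a: 2, b: 4 },
  { a: 3, b: 6 },
  { a: null, b: 1 },
];
const z = standardizeColumns(toyRows, ['a', 'b']);
const pc1 = firstComponent(correlationMatrix(z));
// The trace of a correlation matrix equals the number of fields, so
// eigenvalue / fields.length is the share of variance on the first axis.
console.log(pc1.eigenvalue / 2); // close to 1: the two fields are collinear
```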
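Many methods follow scikit-learn's fit/transform pattern: fit an estimator on the training data, then apply the fitted object to both the training and the test data: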
const penguinsSplit = ds.ml.validation.trainTestSplit(
{ data: penguinsData, X: penguinsFeatureFields, y: 'Species' },
{ ratio: 0.7, shuffle: true, seed: 42 }
);
const penguinScaler = new ds.ml.preprocessing.StandardScaler()
.fit({ data: penguinsSplit.train.data, columns: penguinsFeatureFields });
const penguinsTrainScaled = penguinScaler.transform({
data: penguinsSplit.train.data,
columns: penguinsFeatureFields,
encoders: penguinsSplit.train.metadata.encoders
});
const penguinsTestScaled = penguinScaler.transform({
data: penguinsSplit.test.data,
columns: penguinsFeatureFields,
encoders: penguinsSplit.train.metadata.encoders
});
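The point of fitting on the training split only is that the test data never influences the scaling parameters. Here is a minimal sketch of the pattern with a hypothetical `SimpleStandardScaler` class. It is not the DS `StandardScaler`, takes positional arguments rather than option objects, and assumes the population standard deviation; the library may make different choices.

```javascript
// Hypothetical class for illustration; not part of @tangent/ds.
class SimpleStandardScaler {
  // Learn per-column mean and standard deviation from the given rows.
  fit(rows, columns) {
    this.columns = columns;
    this.means = {};
    this.sds = {};
    for (const col of columns) {
      const values = rows.map((r) => r[col]);
      const m = values.reduce((s, v) => s + v, 0) / values.length;
      const variance =
        values.reduce((s, v) => s + (v - m) ** 2, 0) / values.length;
      this.means[col] = m;
      this.sds[col] = Math.sqrt(variance) || 1; // constant column: avoid dividing by 0
    }
    return this; // allows chaining, as in `new Scaler().fit(...)`
  }

  // Apply the stored parameters; other fields are copied unchanged.
  transform(rows) {
    return rows.map((r) => {
      const out = { ...r };
      for (const col of this.columns) {
        out[col] = (r[col] - this.means[col]) / this.sds[col];
      }
      return out;
    });
  }
}

// Toy data: the training column has mean 2 and standard deviation 1.
const toyTrain = [{ x: 1, label: 'a' }, { x: 3, label: 'b' }];
const toyTest = [{ x: 5, label: 'a' }];

const toyScaler = new SimpleStandardScaler().fit(toyTrain, ['x']);
const toyTrainScaled = toyScaler.transform(toyTrain); // x becomes -1 and 1
const toyTestScaled = toyScaler.transform(toyTest);   // x becomes (5 - 2) / 1 = 3
```

The test row is scaled with the training mean and standard deviation, so its value lands outside the range of the scaled training data instead of being re-centered to zero.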
DS works seamlessly with Observable Plot for visualization:
import * as Plot from '@observablehq/plot';
ds.plot.ordiplot(pca.model, {
colorBy: penguinsData.map(d => d.Species),
showLoadings: true,
}).show(Plot);
Check out the Examples page for live, runnable code snippets.
Want to contribute? Clone the repository and install dependencies:
git clone https://github.com/tangent-to/ds.git
cd ds
npm install
# Run tests
npm test
# Build
npm run build
See CONTRIBUTING.md for more details.