datamash

datamash is a command-line program that performs basic numeric, textual, and statistical operations on tabular data. It handles tasks such as calculations, sorting, and summarization on CSV files and other delimited text.

Datamash: Command-Line Program for Tabular Data Operations

What is Datamash?

datamash (GNU Datamash) is an open-source command-line program that performs basic numeric, textual, and statistical operations on tabular data files. It makes calculations, sorting, and summarization straightforward for plain-text files, CSVs, and other delimited formats.

Some key features and capabilities of datamash include:

  • Basic statistics such as mean, median, min, max, count, sum, and standard deviation on numeric columns
  • Textual operations such as count, unique, and countunique, with optional grouping by one or more key columns
  • Sorting the input on the grouping columns before summarizing
  • Removing duplicate rows by key field and checking that every row has the same number of fields
  • Transposing and reversing fields, and building cross-tabulations (pivot tables)
  • Handling large data files with good performance
  • Easy-to-use syntax, even for those without programming experience
  • Plain-text output on standard output, which can be redirected to a file or piped to other tools

datamash can help with exploratory data analysis and data cleaning in data science, analysis, and reporting workflows. It is packaged for most major Linux distributions. Because it focuses on tabular data transformations, datamash can be a lighter and faster alternative to R, Python, or Excel for some use cases.

Datamash Features

  1. Perform basic calculations on data
  2. Sort data
  3. Summarize data
  4. Operate on CSV files and tabular data

Pricing

  • Open Source

Pros

  • Free and open source
  • Lightweight and fast
  • Easy-to-use command-line interface
  • Supports common data operations

Cons

  • Limited to command-line usage
  • Fewer features than full statistical software
  • Requires familiarity with Unix-style tools


The Best Datamash Alternatives

Top Office & Productivity and Data Processing apps similar to Datamash


R (programming language)

R is an open-source programming language and free software environment for statistical computing, bioinformatics, graphics, data science, and general-purpose programming. The R language provides a wide variety of statistical analysis techniques and graphical capabilities, which make it a popular choice for data analysis and visualization. Some key features of R include: Open-source...

Gawk

Gawk is a text processing program and pattern scanning language designed for processing text files. It is the GNU implementation of awk, a popular data extraction and reporting tool originally found in Unix operating systems. Some key features of Gawk include: Supports common programming constructs like variables, loops, conditionals, functions, and more. Includes...

Mawk

Mawk is an interpreter for the Awk programming language that is focused on pattern scanning and processing text. Awk is commonly used as a data extraction and reporting tool and is well-suited for formatting text files and outputting formatted reports. Some key features of Mawk include: Supports Awk language for text processing,...

Surveytagger

Surveytagger is an online survey and form building application designed to help users create customizable surveys, polls, quizzes, and other data collection tools. Some key features of Surveytagger include: Drag-and-drop survey builder with a wide variety of question types including single/multi choice, rating scales, open-ended, demographics, images, and more. Options to add...