2025-04-05 Timothy Rice version 1.9 * NEWS: Record release date. 2025-03-29 Timothy Rice Make various changes suggested by make syntax-check - Add copyright notice to README - Ensure README refers to COPYING and INSTALL - #include "error.h" -> #include <error.h> - Use GNU link instead of postal address - Portability: Grep: use "> /dev/null" instead of "-q" - Break a long line gnulib: update to latest 2025-03-22 Timothy Rice m4: Update gitignore for latest gnulib gnulib: update to latest cfg.mk: Set UPDATE_COPYRIGHT_HOLDER to myself This data is consumed by `make update-copyright`, so it should be set to whoever's copyright notice will be updated to the latest year. Add myself to a bunch of copyright notices Put NEWS in correct format for do-release-commit-and-tag 2025-03-22 Timothy Rice Revert "rand: Add new program" This reverts commit 0ed0482e4fb05c99ed6fdf6d3c20864cbe61acfd. Adding rand(1) seemed desirable from the point of view of wanting a way to benchmark datamash(1). However, a comprehensive pseudorandom number generation engine would require a non-trivial amount of statistical library support. There is currently little maintainer will for such effort. We can "double-revert" in future if there is interest, but for the moment it seems cleanest to not have rand(1) enter the upcoming stable release 1.9. 2025-01-28 Georg Sauthoff improve const correctness 2025-01-28 Georg Sauthoff datamash: decrease visibility of local symbols i.e. symbols that are only used inside their translation unit and don't have a prototype declaration in a header should be hidden with static. The main motivation behind such a change basically is: 1) minimize the likelihood of possible (future) symbol clashes between different translation units or libraries 2) speed up linking 3) eliminate noise when using `-Wmissing-prototypes`, i.e.
if your local symbols are 'clean', then all remaining missing-prototypes warnings point to missing includes or obsolete functions Note that a global symbol definition that shadows another doesn't necessarily yield a link error, in general. And even in the constellations where it does, for fixing, it's way more convenient to get a compile error, instead. Dima Kogan also added in the email discussion: Keeping your symbols localized is general good hygiene, and has even more upsides: - The robot reader of the code (the compiler) gains a higher level of understanding: it gets knowledge of the FULL list of callers of such a function, and is free to make deep optimizations, like inlining the function - The human reader of the code also gains a higher level of understanding. With local functions I also know about exactly what paths are possible for calling them by simply doing a text search in one file. Relatedly, the linker visibility settings are controllable also (gcc -fvisibility or __attribute__((visibility))), but that's important for shared libraries, and less so for executables. 2025-01-27 Georg Sauthoff datamash: add missing include for validate_double_format Inconsistencies with prototypes yield a compile error 2025-01-25 Timothy Rice AUTHORS: Add Georg Sauthoff 2025-01-25 Georg Sauthoff datamash: Add explicit arguments to crosstab_print prototype In C23, a prototype like `crosstab_print ()` implies that the function will take no arguments. This conflicts with the definition `crosstab_print (const struct crosstab* ct)`.
We can demonstrate the issue by running: $ CFLAGS+="-std=c2x" ./configure && make That resulted in the following error: > src/crosstab.c:152:1: error: conflicting types for 'crosstab_print'; have 'void(const struct crosstab *)' > 152 | crosstab_print (const struct crosstab* ct) > | ^~~~~~~~~~~~~~ > In file included from src/crosstab.c:34: > src/crosstab.h:50:1: note: previous declaration of 'crosstab_print' with type 'void(void)' > 50 | crosstab_print (); > | ^~~~~~~~~~~~~~ 2025-01-25 Timothy Rice datamash: Address pure attribute error 2023-12-18 Erik Auerswald datamash: keep "getnum" operation inside fields Before, the "getnum" operation could scan into adjacent fields, because it used string operations on fields terminated by a field separator, not a NUL byte. * NEWS: Mention bug fix. * src/utils.c (extract_number): Copy field contents to NUL-terminated string to ensure string operations stay inside the field. * tests/datamash-tests-2.pl: Add new, and activate existing, tests for the corrected "getnum" behavior. * tests/datamash-vnlog.pl: Add new tests for the corrected "getnum" behavior. 2023-12-17 Erik Auerswald datamash: bug fix for decimal sep. as field sep. When the locale's decimal separator is used as field separator, reading a numeric value would continue over the field separator, resulting in an "invalid numeric value" error message. This is now prevented by re-trying to read the numeric value after the field contents have been copied to a NUL-terminated buffer. * NEWS: Mention bug fix. * THANKS: Add bug reporters. * src/field-ops.c (field_op_collect): If strtold reads past the field contents, copy field contents to temporary buffer and try strtold again. * tests/datamash-i18n-de.pl: Activate two previously failing tests, and add a test case from the recent bug report. * tests/datamash-tests-2.pl: Activate two previously failing tests. 
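The retry described in the 2023-12-17 entry above (parse the field in place, and fall back to a NUL-terminated copy when strtold reads past the field) can be sketched roughly as follows. This is an illustration only, not the code in src/field-ops.c: the helper name parse_field and its return convention are assumptions made for this sketch, and it assumes the line buffer holding the field is itself NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the datamash implementation): parse the
   field that starts at STR and is LEN bytes long.  STR must point
   into a NUL-terminated line buffer.  Returns 1 and stores the value
   in *OUT if the whole field is a valid number, 0 otherwise.  */
static int
parse_field (const char *str, size_t len, long double *out)
{
  assert (str != NULL && out != NULL);

  char *end = NULL;
  long double val = strtold (str, &end);
  size_t used = (size_t) (end - str);

  if (used > len)
    {
      /* strtold ran over the field separator into the next field.
         Retry on a private, NUL-terminated copy of just this field.  */
      char *copy = malloc (len + 1);
      if (copy == NULL)
        return 0;
      memcpy (copy, str, len);
      copy[len] = '\0';
      val = strtold (copy, &end);
      used = (size_t) (end - copy);
      free (copy);
    }

  /* Reject empty fields and fields with trailing non-numeric bytes.  */
  if (len == 0 || used != len)
    return 0;

  *out = val;
  return 1;
}
```

With '.' as the field separator in the default "C" locale, the line "1.5" holds the two fields "1" and "5"; parsing the first field in place would consume all three bytes, and the copy limits the parse to the field itself.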
2023-12-11 Erik Auerswald maint: fix "make syntax-check" errors * cfg.mk: Except file doc/fdl.texi from line length check. * src/utils.c (update_best_seq): Add space before opening parenthesis. 2023-12-10 Erik Auerswald tweak NEWS entry for datamash antimode bug fix A "set of numbers" implies that no number is repeated. Both the mode and antimode operations look at repeated numbers. Thus use "sequence" instead of "set" in the description. * NEWS: Replace "set" with "sequence" in latest bug fix entry. 2023-12-10 Erik Auerswald datamash: fix "antimode" operation Both "mode" and "antimode" operations are implemented in the same function "mode_value()" in file "src/utils.c". The algorithm traverses a list of sequences of numbers and attempts to remember the longest (for mode) or shortest (for antimode). Before, the sequence information was updated for every new number encountered, i.e., in general before the actual sequence length was known. This approach works for mode, but not for antimode. Now, the sequence information is updated only after a sequence has ended. There are two ways a sequence can end: (1) a different number is read, (2) there are no more numbers to read. In both cases, this new sequence information (length and value) needs to be checked against the previous longest/shortest length and possibly updated. * NEWS: Mention bug fix. * src/utils.c (mode_value): Update best sequence information only after end of sequence. Move update code to new function, because it is now used twice. (update_best_seq): New function. * tests/datamash-tests-2.pl: Activate formerly broken now working "antimode" tests. 2023-12-10 Erik Auerswald datamash: new "mode" and inactive "antimode" test * tests/datamash-tests-2.pl: Add new test for both "mode" and "antimode". The "antimode" test currently fails and is commented out. 
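The "update only after a sequence has ended" approach from the 2023-12-10 antimode fix above can be sketched as below. This is a simplified illustration, not the code in src/utils.c: the function names, the sorted-input requirement, and the tie-breaking rule (the first run seen wins) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: compare a finished run (VAL repeated LEN times)
   against the best run recorded so far.  *BEST_LEN == 0 means that no
   run has been recorded yet.  Ties keep the earlier run; that is a
   choice made for this sketch only.  */
static void
update_best (bool want_mode, double val, size_t len,
             double *best_val, size_t *best_len)
{
  bool better = want_mode ? (len > *best_len) : (len < *best_len);
  if (*best_len == 0 || better)
    {
      *best_val = val;
      *best_len = len;
    }
}

/* Return the mode (WANT_MODE true) or antimode (WANT_MODE false) of
   VALUES, which must be sorted and contain N > 0 elements.  */
static double
mode_or_antimode (const double *values, size_t n, bool want_mode)
{
  assert (values != NULL && n > 0);

  double best_val = values[0];
  size_t best_len = 0;
  double run_val = values[0];
  size_t run_len = 1;

  for (size_t i = 1; i < n; i++)
    {
      if (values[i] == run_val)
        run_len++;
      else
        {
          /* Case (1): a different number ends the current run.  */
          update_best (want_mode, run_val, run_len, &best_val, &best_len);
          run_val = values[i];
          run_len = 1;
        }
    }

  /* Case (2): the end of the input ends the final run.  */
  update_best (want_mode, run_val, run_len, &best_val, &best_len);
  return best_val;
}
```

Updating the best run while it is still growing is harmless for the mode, because a longer run only gets better, but it records a too-short length for the antimode. That is why the comparison happens only at the two run-ending points.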
2023-12-09 Erik Auerswald datamash: four more "mode" and "antimode" tests * tests/datamash-tests-2.pl: Duplicate regression test for "mode" to also test "antimode". Add normal test with single input value to both operations. 2023-12-08 Erik Auerswald datamash: more "mode" and "antimode" tests Kingsley G. Morse Jr. has reported a bug regarding the "antimode" operation in . Both the "mode" and "antimode" operations are implemented in the function "mode_value()" in the file "src/utils.c". Currently, there are only a few tests for each of the two operations. Add more tests to have working examples. Also add commented out tests that trigger "antimode" bugs. * tests/datamash-tests-2.pl: Add 11 tests for "mode", 11 working tests for "antimode", and 5 commented out tests for "antimode" that are currently broken. 2022-11-07 Shawn Wagner Remove obsolete autoconf AC_PROG_CC_STDC macro * configure.ac: Remove AC_PROG_CC_STDC Remove deprecated gnulib fdl module * bootstrap.conf: Remove fdl from the module list * doc/fdl.texi: Explicitly include instead of relying on gnulib to generate it. Remove deprecated gnulib non-recursive-gnulib-prefix-hack module * Makefile.am: Add AUTOMAKE_OPTIONS needed for the new approach * bootstrap.conf: Remove old module, enable suggested replacement for it. * m4/.gitignore: Update for removed module Update bootstrap notes for OpenBSD. * HACKING.md: Updates for OpenBSD 7.2, related typo fixes. 2022-11-07 Shawn Wagner Add gnulib checks for sys/random.h and getrandom(). Needed to compile rand on non-GNU OSes. * Makefile.am: Add $(LIB_GETRANDOM) to rand libraries * bootstrap.conf: Add getrandom and sys_random modules * m4/.gitignore: Updates for the new modules 2022-09-03 Erik Auerswald datamash: more --vnlog tests (legend corner cases) The vnlog format is quite permissive regarding field names. Field names can comprise non-alphanumeric characters, and field names may be duplicated. 
Add tests to check that GNU Datamash can handle (at least some) such vnlog input. The test with a duplicate field name is intended to help avoid accidental changes to GNU Datamash behavior, not to prescribe that the first matching field name should be used. * tests/datamash-vnlog.pl: Add two tests, the first with non- alphanumeric characters, the second with a duplicate field name. 2022-08-27 Erik Auerswald datamash: --vnlog "legend" parsing fixes * src/text-lines.c (line_record_fread): Skip prefix matching regex '^\s*#\s*' instead of regex '^[\s#]+'. * tests/datamash-vnlog.pl: Additional vnlog legend tests. 2022-08-17 Shawn Wagner Make -g and groupby take ranges of fields. So that 1-4 works instead of needin * src/op-parser.c: Accept ranges for -g and groupby * tests/datamash-tests.pl, datamash-error-msgs.pl, datamash-crosstab.pl: Some tests for the above. * doc/datamash.text: Document the above. * NEWS: Changelog entry. 2022-08-17 Timothy Rice rand: Add new program 2022-08-14 Erik Auerswald docs: adjust datamash --vnlog texinfo description * doc/datamash.texi: Add that both input and output are affected. Mark the link to the format description as a URL. 2022-08-13 Erik Auerswald tests: comments and empty lines Without any options, GNU Datamash does not support comments, and does not treat empty lines as different from any other line. With -C/--skip-comments, it ignores complete lines starting with either '#' or ';' as first non-whitespace character. With --vnlog, it ignores empty lines, complete lines where the first non-whitespace character is a '#' (followed by either '!' or '#' in the vnlog prologue), and trailing comments, i.e., the part of a non-empty, non-comment, and non-header line started with optional whitespace and a '#' character. * tests/datamash-check.pl: Add tests with empty and comment lines in the input with different options. * tests/datamash-vnlog.pl: Add trailing comment started with whitespace to existing input.
Add test to verify that a Semicolon (';') does not start a trailing comment. Add a test that the header line does not support comments. 2022-08-13 Timothy Rice maint: Add vnlog to bash completion maint: Add Dima Kogan to AUTHORS/THANKS files 2022-08-13 Dima Kogan Add vnlog support Dima Kogan's vnlog data format is explained at https://github.com/dkogan/vnlog This support is experimental in GNU Datamash. 2022-08-08 Timothy Rice maint: Tidy up NEWS item for dotprod datamash: add new option -S/--seed to set random seed 2022-08-07 Erik Auerswald tests: datamash -C has no inline comments * tests/datamash-tests-2.pl: Add two tests for errors when trying to use inline comments. maint: wrap two lines to make syntax-check happy 2022-08-07 Timothy Rice maint: Fix small typo 'data' -> 'date' in NEWS datamash: new operation: dotprod 2022-08-06 Timothy Rice Add NEWS item for getrandom change Switch to getrandom for seed source Move init_random function into new randutils source 2022-08-06 Timothy Rice Remove sc_indent from syntax checks The sc_indent check was added to gnulib in 8f043c6 on 2021-09-03. By default it forces code to conform to `indent -ppi 1`. This contradicts the list of format constraints suggested at [1], in particular the GNU preference of having two spaces for each indent level. [1] https://www.gnu.org/prep/standards/standards.html#Formatting 2022-08-06 Timothy Rice datamash, decorate: add -h/-V for --help/--version maint: Fix typo 'syntaax' 2022-08-03 Timothy Rice Update gnulib to latest Includes changing or casting a couple of size_t variables/outputs to idx_t. 2022-07-23 Timothy Rice maint: Ignore vc-diffs 2022-07-23 Timothy Rice maint: post-release administrivia Automatically done by `make release RELEASE='1.8 stable'` * NEWS: Add header line for next release. * .prev-version: Record previous version. * cfg.mk (old_NEWS_hash): Auto-update. 2022-07-23 Timothy Rice version 1.8 * NEWS: Record release date. 
2022-07-17 Erik Auerswald datamash: tests: decimal point as field separator * tests/datamash-tests-2.pl: Add a commented out test for the "getnum" operation with a decimal point as field separator. Adjust alignment of two existing commented out tests. 2022-07-17 Erik Auerswald datamash: tests: commented out getnum i18n tests The "getnum" operation is neither properly locale-aware nor does it ignore the locale setting. As a result floating point numbers are not extracted correctly unless the locale-specific decimal separator is a period. * tests/datamash-i18n-de.pl: Add commented out getnum tests. 2022-07-17 Erik Auerswald tests: numbers with two decimal points in input * tests/datamash-error-msgs.pl: Add tests for error message when a number with two decimal points is given as input data. tests: decimal separator as field separator * tests/datamash-i18n-de.pl: Add tests using the locale's decimal separator (a comma) as field separator. * tests/datamash-tests-2.pl: Add tests using the locale's decimal separator (a period) as field separator. tests: adjust commented out i18n test * tests/datamash-i18n-de.pl: Use a unique ID to allow commenting in without further adjustments. Align test definition fields. maint: regression test consolidation * tests/datamash-tests-2.pl: Move regression tests to have them next to each other in one group. 2022-07-11 Timothy Rice datamash: tests: Add test for crosstab with header-in, no header-out datamash: tests: Rename c3 header-in test for pcov consistent with scov datamash: tests: Add more paired-param tests with header in/out docs: Note non-standard --header-out with crosstab docs: Remove sentence about groupby/crosstab not understanding header names. 2022-07-09 Timothy Rice datamash: maint: Fix some indentation datamash: bugfix: Ensure rmdup respects --output-delimiter Fixes bug reported by Dima Kogan. datamash: bugfix: Allow crosstab to be called by header field name. Ensure `--header-in crosstab x,y` does not crash. 
Fixes bug reported by Dima Kogan. datamash: maint: fix long line maint: ignore side effects of make syntax-check maint: Fix minor typo datamash: maint: Add debug helper macro WHEREAMI() tests: Add the test for pcov header in/out datamash: Print all operation columns in output header Ensure `--header-out pcov x:y` shows `pcov(x,y)` in header. Fixes bug reported by Dima Kogan. 2022-07-05 Timothy Rice maint: Break long lines maint: Update version notices. 2022-07-04 Erik Auerswald maint: align datamash binning tests * tests/datamash-tests-2.pl: Align binning tests. 2022-07-04 Erik Auerswald tests: more datamash binning tests Floating point numbers as operation parameters can be specified using scientific notation (e.g., 1.2e3 for 1200.0). * tests/datamash-tests-2.pl: Add scientific notation bin sizes. 2022-07-04 Erik Auerswald tests: test parser corner cases Add tests of corner cases regarding whitespace in the operation parsing of GNU Datamash in order to avoid introducing unintended changes of behavior. * tests/datamash-parser.pl: Add tests with additional whitespace. 2022-07-03 Erik Auerswald tests: add more datamash parser tests * tests/datamash-parser.pl: Additional testing of correct and incorrect use of optional operation parameters. 2022-07-03 Erik Auerswald maint: more unique test identifiers for datamash Having unique test identifiers helps in locating the problems with test failures. It seems as if the intention is to have unique test identifiers, at least per tested binary. Fix a case of duplicated test identifiers in the two test files datamash-tests.pl and datamash-parser.pl for binning related test cases. * tests/datamash-parser.pl: Rename test identifier 'b1' to '31' and 'b2' to 'b32'. 2022-06-26 Erik Auerswald tests: add third datamash i18n test case Depending on tokenizer changes to support comma as decimal separator, three comma separated fields might trigger a problem that is avoided with two comma separated fields. 
* tests/datamash-i18n-de.pl: Add test case with three fields separated with commas. 2022-06-26 Erik Auerswald tests: avoid Perl warning in datamash-i18n-de.pl The string comparison `$lc_de eq undef` results in up to two warnings. If `$lc_de` is defined, the single warning Use of uninitialized value in string eq at ./tests/datamash-i18n-de.pl line 39. is emitted. This is caused by comparing with `undef` in the string equality test `eq`. If the locale `de_DE.utf8` is not found, `$lc_de` is undefined. If `$lc_de` is undefined, two warnings are emitted: Use of uninitialized value $lc_de in string eq at ./tests/datamash-i18n-de.pl line 39. Use of uninitialized value in string eq at ./tests/datamash-i18n-de.pl line 39. Using `defined()` to test if `$lc_de` is defined avoids this. * tests/datamash-i18n-de.pl: Use defined() to check if a variable is defined. 2022-06-26 Erik Auerswald tests: add second datamash i18n test case Supporting a comma as decimal separator for numeric arguments to GNU Datamash operations risks confusing a comma separated list of fields with a floating point number. * tests/datamash-i18n-de.pl: Add test case with a comma separated list of field numbers. 2022-06-25 Erik Auerswald maint: fix a typo in a comment 2022-06-25 Timothy Rice maint: Remove incorrect comment about LC_NUMERIC maint: Skip German test if de_DE.utf8 locale not found Test decimal separator in de_DE.UTF-8 locale 2022-06-25 Shawn Wagner src/decorate.c: Fix a NetBSD-specific seg fault. 
2022-06-24 Erik Auerswald datamash: re-write binning for negative numbers Make the binning code more explicit regarding handling of negative binning values: - changing a negative zero into a positive zero can only be needed for negative values; - if the fractional part of a negative value is zero: - the number is the lower bound of the bucket interval, - and the number could be a negative zero; - if the fractional part of a negative value is non-zero: - the number falls into the preceding bucket interval. When testing with both negative and non-negative numbers, the new code was not slower. It even seemed to be a tiny bit faster on average. * src/field-ops.c (field_op_collect): re-write binning code for negative numbers. 2022-06-23 Timothy Rice datamash: Fix binning of negative numbers. 2022-06-18 Shawn Wagner Add framework for installing hooks into cloned git repositories. Includes a pre-commit hook that runs make syntax-check 2022-06-18 Timothy Rice maint: Convert sort+header tests from shell to perl maint: fix long lines maint: Make test indentation more consistent maint: fix long lines 2022-06-17 Timothy Rice Rename deprecated tests Deprecate -f/--full for non-linewise operations 2022-06-12 Erik Auerswald tests: add cheap I/O error test The existing tests/datamash-io-errors.sh is marked as expensive and requires two file system images prepared to provoke I/O errors for GNU Datamash to detect and report. As a result it is executed less frequently than most other tests. The new tests/datamash-io-errors-cheap.sh requires just the availability of the special file "/dev/full" to immediately provoke an I/O error on output. This test requires minimal input data and is cheap to run. * Makefile.am (TESTS): Add new test file. * tests/datamash-io-errors-cheap.sh: New file with one test. 
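The rules for negative values listed in the 2022-06-24 binning entry above can be sketched as follows. This is an illustration under assumptions, not the code in src/field-ops.c: the function name bin_value is hypothetical, and the bucket size is assumed to be positive.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical sketch: return the lower bound of the bucket of width
   BUCKET (assumed > 0) that contains VAL.  */
static long double
bin_value (long double val, long double bucket)
{
  assert (bucket > 0);

  long double int_part;
  long double frac = modfl (val / bucket, &int_part);

  if (val < 0)
    {
      if (frac == 0)
        {
          /* VAL already is the lower bound of its bucket.  The integer
             part may be a negative zero (for example after underflow
             of the quotient), so turn it into a positive zero.  */
          if (int_part == 0)
            int_part = 0.0L;
        }
      else
        {
          /* Truncation rounded toward zero, so the value belongs to
             the preceding bucket interval.  */
          int_part -= 1;
        }
    }

  return int_part * bucket;
}
```

For a bucket size of 3, the value 7 falls into the bucket starting at 6, the value -6 is itself a lower bound, and the value -7 falls into the bucket starting at -9.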
2022-06-07 Erik Auerswald datamash: fix segmentation fault As reported by Catalin Patulea on bug-datamash@gnu.org in , GNU Datamash could crash with a segmentation fault if the unique or countunique operations were used with input data containing NUL bytes. The problem was that the field_op_get_string_ptrs() function could create more pointers than it allocated memory for if the input data contained NUL bytes. The solution is to add a check to avoid writing past the end of the "ptrs" buffer. * NEWS: Mention bug fix. * src/field-ops.c (field_op_get_string_ptrs): Do not write past the end of the "ptrs" buffer. * tests/datamash-tests-2.pl: Add tests to verify bug fix. 2022-06-06 Erik Auerswald doc: mention ms and rms in --help and man page The recent operations ms (mean square) and rms (root mean square) are listed only in the texinfo manual. Add them to both the 'datamash --help' output and the datamash(1) man page. * src/datamash.c (usage): Add ms and rms to Statistical Grouping operations. * man/datamash.x: Likewise. 2022-06-05 Erik Auerswald tests: more leading and trailing whitespace tests * tests/datamash-tests.pl: Add tests with leading whitespace, trailing whitespace, and both leading and trailing whitespace. tests: leading and trailing whitespace behavior * tests/datamash-tests.pl: Add whitespace-only tests. doc: texinfo manual adjustments * doc/datamash.texi: Mention that "cut" operation uses given field ordering. Expand on leading and trailing whitespace description. Ask for unified instead of context diff. 2022-06-04 Erik Auerswald maint: fix "make syntax-check" errors * .gitignore: Use "file system" instead of "filesystem". * tests/datamash-tests-2.pl: Wrap lines longer than 80 characters. maint: fix typo in a comment 2022-06-04 Timothy Rice Remove commented-out code Fix typo datamash: Align field_operations columns It was difficult to visually group entries with the columns zig-zagging. 
datamash: Alias echo -> cut and unique -> uniq 2022-06-03 Erik Auerswald tests: enable valgrind test to pass The sub-test "custom-format" of the expensive test datamash-valgrind.sh would always fail, because the datamash was supposed to sum all input numbers, but some of those are too big for 80-bit extended floating point numbers as used on x86. Thus datamash would emit an error message and return an exit code of 1. That was interpreted as a test failure. * tests/datamash-valgrind.sh: Use different valgrind error exit code to distinguish between valgrind detecting a memory leak and datamash reporting an error. 2022-06-02 Timothy Rice maint: Impose '-T small' for fullfs check maint: Ignore side-effects make check-expensive 2022-06-01 Shawn Wagner Merge branch 'master' of ssh://git.sv.gnu.org/srv/git/datamash Fix memory leaks in decorate Fix memory leaks with custom numeric precisions and formats Fix some memory leaks in crosstab code Fix a memory leak in parsing column names. 2022-06-01 Erik Auerswald doc: fix a few typos * HACKING.md: Fix typos. * build-aux/create_corrupted_file_system.sh: Fix typos in comments. * build-aux/create_small_file_system.sh: Fix typo in a comment. 2022-05-31 Shawn Add bootstrap instructions for NetBSD and OpenBSD to HACKING.md 2022-05-31 Erik Auerswald doc: fix typo and quoting in man page template This is based on a Debian patch applied to GNU Datamash during packaging. The Debian patch from Alejandro Garrido Mota modifies the generated datamash man page. The Debian patch can be found at: https://sources.debian.org/src/datamash/1.7-2/debian/patches/fix-manpage.diff/ * man/datamash.x: Spelling fix varient->variant. Consistently use ' instead of \' as single quote character. 2022-05-30 Erik Auerswald doc: tweak help output and man pages Mention the LC_NUMERIC environment variable in datamash --help output and the generated man page. 
Add an "Options:" header line to the output of datamash --help and decorate --help so that help2man detects the OPTIONS section. Use DESCRIPTION instead of OVERVIEW to add text from the man page template to the DESCRIPTION section of the decorate(1) man page. Move the examples from the decorate(1) man page template into the EXAMPLES section of the generated man page. * man/decorate.x: Rename OVERVIEW to DESCRIPTION. Move EXAMPLES. * src/datamash.c (usage): Add "Options:" header and "Environment:" section to help output. * src/decorate.c (usage): Add "Options:" header to help output. 2022-05-29 Erik Auerswald datamash: fix short option -c X The short option -c X is documented as a short alternative to --collapse-delimiter=X, but it is not accepted. Fix this. * src/datamash.c (short_options): Add "c:". * tests/datamash-tests.pl: Test use of -c. 2022-05-29 Erik Auerswald doc: leading and trailing whitespace with -W With the -W, --whitespace option, datamash(1) ignores leading whitespace, but not trailing whitespace. Trailing whitespace results in an additional empty field at the end of the line. This is different to field splitting in Awk, Bash, or Python. This behavior is tested for, but not documented. Thus mention it in the texinfo documentation. * doc/datamash.texi: Add a sentence to the -W, --whitespace option description. 2022-05-29 Erik Auerswald doc: tweak NEWS entries for new features There are new features for both datamash(1) and decorate(1). Ensure that every new feature description mentions the program. Additionally, move the new decorate(1) feature entry below the new datamash(1) features to keep chronological order of entries. * NEWS: Always mention the changed program. Move new entry below existing entries. 2022-05-29 Erik Auerswald decorate: new conversions ipv6v4map and ipv6v4comp The two new decorate(1) conversion methods ipv6v4map and ipv6v4comp allow to sort on a field that may contain either an IPv6 or an IPv4 address.
IPv4 and IPv6 addresses can be seen as IP addresses. Logs from a "dual-stack" application, e.g., a web server, may contain either an IPv4 or IPv6 address at a given position in each line. Thus if one wants to sort the log file on the IP address, both IPv4 and IPv6 addresses need to be accepted as sort key and sorted consistently. One approach is to transform one address type into the other before sorting. IPv6 supports the transformation of IPv4 addresses into IPv6 addresses. There are two common methods for accommodating IPv4 addresses in IPv6: IPv4-Mapped addresses and the deprecated IPv4-Compatible addresses. Both can be used to convert a given IPv4 address to an IPv6 address. Both IPv4-Mapped and IPv4-Compatible IPv6 address ranges are reserved by IANA and always represent IPv4 addresses in a dual-stack enabled application. IPv4-Compatible addresses just add 96 leading zero bits to the 32 bit IPv4 address to create a 128 bit IPv6 address. This results in an ambiguity for the unspecified address (all-zero in both IPv4 and IPv6) and the IPv6 localhost address ::1 with the first host address of "this" network in IPv4 (0.0.0.1). IPv4-Mapped addresses avoid this ambiguity. But since IPv4-Compatible IPv6 addresses can be seen as treating the IP address (both version 4 and version 6) as a specific way to represent an integer value I think it is useful to support this transformation as well. Both conversions logically convert an IPv4 address to an IPv6 address, but the code actually creates a textual representation of a 128 bit integer from either an IPv4 or IPv6 address.
Functionality like this was requested for sort(1) from GNU Coreutils: https://lists.gnu.org/archive/html/coreutils/2011-06/msg00078.html https://lists.gnu.org/r/bug-coreutils/2015-06/msg00039.html It was rejected for sort(1) from GNU Coreutils: https://www.gnu.org/software/coreutils/rejected_requests.html#sort https://lists.gnu.org/r/bug-coreutils/2015-06/msg00041.html Thus it seems appropriate for decorate(1), complementing the existing conversion methods for either IPv4 or IPv6 addresses, but not both. * NEWS: Mention new decorate conversion methods. * src/decorate-functions.c: Implement new conversion methods. (decorate_ipv6_ipv4): New function. (decorate_ipv6_ipv4_mapped): New function. (decorate_ipv6_ipv4_compat): New function. (builtin_conversions): Add new operations. * tests/decorate-errors.pl, tests/decorate-sort-tests.pl, tests/decorate-tests.pl: Add tests for new conversion methods. 2022-05-29 Erik Auerswald doc: adjust short description in decorate man page The decorate man page used the datamash short description. Replace it with a short description of decorate. * man/decorate.x: Adjust short description. 2022-05-29 Erik Auerswald maint: fix syntax-checks * AUTHORS: Break line at 80 characters. * THANKS: Break line at 80 characters. doc: mention -C, --skip-comments in texinfo manual * doc/datamash.texi: Mention -C, --skip-comments doc: fix man page name in help output * src/datamash.c: Use PROGRAM_NAME as man page name. * src/decorate.c: Likewise. 2022-05-29 Erik Auerswald doc: mention LC_NUMERIC in texinfo manual The LC_NUMERIC environment variable affects both the decimal and thousands separator used by GNU Datamash. * doc/datamash.texi: Mention LC_NUMERIC. 2022-05-28 Erik Auerswald fix floor/ceil/up/down confusion in man page * man/datamash.x: floor is down, ceil is up. 2022-05-28 Timothy Rice Change tabs to spaces for consistency Update bug report address to bug-datamash Promote Erik to the 'members' area in AUTHORS (i.e. git push access.)
Clean up trailing end-of-line whitespaces. Partial revert of 9347234 which incorrectly deleted entire lines due to poor choice of sed command. Clean up trailing end-of-line whitespaces. 2022-05-27 Shawn Wagner Squashed commit of the following: commit ee69c4d1c27e683d91163129ca1f5e43adf34243 Author: Shawn Wagner Date: Thu May 26 05:51:19 2022 -0700 Add tests for mean square and root mean square Also update the R code in the file used to verify results to include newer operations. commit 2036f7259a371420c8a90b8af31187f77a044290 Author: Shawn Date: Fri Jun 19 15:38:07 2020 -0700 Add mean square/root mean square commands. doc/datamash.texi: Documentation src/field-ops.c,src/op-defs.c,src/op-defs.h: New ms and rms commands. 2022-05-27 Timothy Rice Expand AUTHORS and THANKS 2022-05-18 Shawn Wagner Rebuild mr/.gitignore 2022-05-18 Shawn Update bash completion module contrib/bash-completion/datamash: Add new commands and options 2022-05-18 Shawn Wagner Add -c/--collapse-delimiter argument Tweaks to satisfy make syntax-check 2022-05-18 Shawn Squashed commit of the following: commit 700e3ab38e19d5f53eb332ea11fa6001505fc026 Author: Shawn Wagner Date: Sat Jun 20 01:53:04 2020 -0700 Properly escape sort commands executed by sh Also removes the arbitrary 1024 character limit on sort command. * bootstrap.conf,Makefile.am: Add sh-quote gnulib module * src/decorate.c: --print-sort-args quotes arguments when needed. * src/datamash.c: Shell escape sort command used by popen(), get rid of max length of it. * tests/datamash-sort-errors.sh,tests/decorate-tests.sh: Handle new output. commit ae84f5dddf6761d80cd08e441358ca23c96f13d6 Author: Shawn Wagner Date: Sat Jun 20 01:05:09 2020 -0700 Use an unstable sort with NetBSD /usr/bin/sort in decorate It's stable by default, unlike any other sort(1) I checked, so when using it, pass an option to make it unstable to pass test cases.
* src/decorate.c: Pass -S to /usr/bin/sort on NetBSD * tests/decorate-sort-tests.pl tests/decorate-tests.pl: Force using /usr/bin/sort on NetBSD, and account for new output in test cases. commit 267c8974a5eed4844d9ee17e3f0aa9c45fe5d95f Author: Shawn Wagner Date: Fri Jun 19 23:14:53 2020 -0700 Add a --sort-cmd argument to datamash and decorate. Allows overriding the version of sort(1) found by configure. src/datamash.c,src/decorate.c: Add --sort-cmd argument. tests/datamash-sort-errors.sh: Use it in tests designed to fail. doc/datamash.texi: Document it. commit 82fb2366e500533f91171465c511ddc3e600a53b Author: Shawn Wagner Date: Fri Jun 19 23:00:39 2020 -0700 Make configure check for sort(1) First looks for gsort, then sort, so that non-GNU userland OSes with GNU coreutils installed will prefer GNU sort. * configure.ac: test for gsort and sort * src/datamash.c,src/decorate.c: Use the path to sort found by configure. * src/datamash.c: Make it harder to overflow the sort command string. * tests/decorate-tests.pl: Fix test cases for using full path to sort 2022-05-18 Shawn Wagner Fix --help not including default value for --filler argument. Patch from Erik Auerswald. 2022-05-14 Shawn Wagner Some auditing of gnulib modules Added some missing ones for things used by datamash, removed some unused ones. 
Squashed commit of the following: commit ce8fd789a0d60f0592ec79567f84c09b05e2b4aa Author: Shawn Wagner Date: Sat May 14 05:43:05 2022 -0700 Update m4/ax_c_long_long.m4 from autoconf-archive latest commit d5cc572bc52d02169959361aca3f0e0823643d79 Author: Shawn Wagner Date: Sat May 14 05:34:48 2022 -0700 Update m4/.gitignore commit ef0ec5ca533067948800e3f8c10292b553cfb4eb Author: Shawn Date: Thu Apr 23 15:57:02 2020 -0700 Add missing gnulib modules used in decorate commit 67cbe0b3462dab739a3bd81a89e1e91b2d128519 Author: Shawn Date: Mon Apr 20 01:49:03 2020 -0700 Add missing gnulib modules 2022-05-14 Shawn Wagner Typo in manpage -h -> -H Patch by Erik Auerswald Fix spelling errors and other issues in texi documentation. Patch from Erik Auerswald. Fix an incorrect format in a decorate(1) message. Reported by Jiri Hladky, fixed by Erik Auerswald. Spelling fixes in documentation 2021-01-06 Assaf Gordon maint: update copyright year Run "make update-copyright" and then: * gnulib: Update to latest. * tests/init.sh, bootstrap: Update to latest from gnulib. * m4/.gitignore: Add gnulib updates. 2020-04-30 Shawn datamash: fix missing -z when calling sort(1) When sorting NUL terminated lines, the needed -z option wasn't being passed to sort(1), causing errors due to improperly sorted data. Fixed, plus raises an error if the system sort(1) doesn't support -z. * NEWS: Mention bugfix. * configure.ac: Test to see if sort(1) understands -z * datamash.c (open_input): If "-z" and "-s" are used, add "-z" to the sort(1) command line; Exit with an error if sort(1) does not support "-z". 2020-04-30 Shawn datamash,decorate: use pledge(2) on OpenBSD to restrict capabilities pledge is an OpenBSD specific mechanism for restricting what syscalls a process can execute, or limiting how they behave. It's considered polite on OpenBSD to limit a program to just the bare minimum it needs to run. See https://man.openbsd.org/pledge.2 for details. * configure.ac: Add check for pledge(2). 
* src/system.h (openbsd_pledge): New static function, calling pledge(2). * src/datamash.c (main), src/decorate.c (main): Call new function on startup. 2020-04-30 Shawn tests: use jot as seq on OpenBSD instead of skipping tests * init.cfg (openbsd_seq_replacement_): New function. Work around OpenBSD's lack of seq(1) using jot(1). * tests/datamash-rand.sh, tests/datamash-sort-errors.sh, tests/datamash-strbin.sh: Use new function. tests: handle getopt_long(3) differences on OpenBSD * tests/datamash-error-msgs.pl, tests/decorate-errors.pl: Work around OpenBSD's getopt_long(3) producing different error messages from GNU version. 2020-04-24 Assaf Gordon maint: update NEWS with version 1.7 and date 2020-04-24 Assaf Gordon decorate: new program (experimental) decorate(1) works in tandem with sort(1) to enable sorting by non-standard ordering, e.g. IPv4, IPv6, roman numerals. Suggested by Pádraig Brady in: https://lists.gnu.org/r/bug-coreutils/2015-06/msg00076.html * NEWS: Mention new program. * bootstrap.conf: Add sys_socket, netinet_in gnulib modules. * .gitignore: Ignore decorate related files. * Makefile.am (bin_PROGRAMS): Add decorate. (decorate_SOURCES/CFLAGS/CPPFLAGS/LDFLAGS): New items. (man_MANS): Add decorate. (TESTS): Add new tests. * src/decorate.c, src/key-compare.{c,h}: New program, adapted from https://lists.gnu.org/archive/html/coreutils/2019-03/msg00056.html * src/decorate-functions.{c,h}: New module. * man/decorate.x: New file. * src/system.h: Add quotef,quotef_n,FALLTHROUGH,SORT_FAILURE. * tests/decorate-errors.pl, tests/decorate-sort-tests.pl, tests/decorate-tests.pl: New tests. * po/POTFILES.in: Add new .c files. 2020-04-24 Assaf Gordon maint: fix syntax-checks following sha256,geomean additions * src/datamash.c (usage), tests/datamash-sha.pl, tests/datamash-stats.pl: Break line at 80 characters. * src/field-ops.c: Fix space-before-parentheses.
2020-04-23 Assaf Gordon maint: update help2man 2020-04-23 Shawn datamash: new operations: geomean,harmmean They compute the geometric and harmonic means, thus adding support for all three classic Pythagorean means. $ seq 5 | datamash harmmean 1 2.1897810218978 $ seq 5 | datamash geomean 1 2.6051710846974 Equivalent to the following R code: > library("psych") > x = seq(5) > harmonic.mean(x) [1] 2.189781 > geometric.mean(x) [1] 2.605171 * NEWS, doc/datamash.texi, man/datamash.x: Mention new operations * bootstrap.conf: Use gnulib's logl module. * Makefile.am (datamash_LDADD): Add more math libraries. * m4/.gitignore: Add logl files. * src/op-defs.{h,c}: Define new operations. * src/field-ops.c: Implement new operations * src/datamash.c (usage): List new operations. * tests/datamash-stats.pl: Add tests. 2020-04-10 Shawn datamash: new line operations: sha224 and sha384 Added for the sake of completeness; the implementations are already present. $ printf "%s\n" 1 2 3 | datamash sha224 1 e25388fde8290dc286a6164fa2d97e551b53498dcbf7bc378eb1f178 58b2aaa0bfae7acc021b3260e941117b529b2e69de878fd7d45c61a9 4cfc3a1811fe40afa401b25ef7fa0379f1f7c1930a04f8755d678474 $ printf a | datamash sha384 1 54a59b9f22b0b80880d8427e548b7c23abd873486e1f035dce9cd697e85175033caa88e6d57bc35efae0b5afd3145f31 Equivalent to: $ printf 1 | sha224sum e25388fde8290dc286a6164fa2d97e551b53498dcbf7bc378eb1f178 - $ printf 2 | sha224sum 58b2aaa0bfae7acc021b3260e941117b529b2e69de878fd7d45c61a9 - $ printf 3 | sha224sum 4cfc3a1811fe40afa401b25ef7fa0379f1f7c1930a04f8755d678474 - $ printf a | sha384sum 54a59b9f22b0b80880d8427e548b7c23abd873486e1f035dce9cd697e85175033caa88e6d57bc35efae0b5afd3145f31 * NEWS, doc/datamash.texi, man/datamash.x: Mention new operations. * src/datamash.c (usage): List new options. * src/op-defs.{h,c}: Define new operations. * src/field-ops.c: Implement operations. * tests/datamash-sha.pl: Add tests. 
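The per-line behaviour of the sha224 and sha384 operations described in the entry above can be imitated with Python's standard hashlib module. This is only an illustrative sketch, not datamash's implementation; the helper name `hash_lines` is hypothetical.

```python
import hashlib


def hash_lines(values, algorithm="sha224"):
    """Hash each input value separately, the way a per-line operation would.

    Hypothetical helper for illustration only; not part of datamash.
    """
    return [hashlib.new(algorithm, v.encode("utf-8")).hexdigest() for v in values]


if __name__ == "__main__":
    # Loosely mirrors: printf "%s\n" 1 2 3 | datamash sha224 1
    for digest in hash_lines(["1", "2", "3"]):
        print(digest)
    # Loosely mirrors: printf a | datamash sha384 1
    print(hash_lines(["a"], "sha384")[0])
```

Each value is hashed without its trailing newline, which is why the entry compares against `printf 1 | sha224sum` rather than `echo 1 | sha224sum`.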
2020-02-24 Assaf Gordon doc: update NEWS with version 1.6 and date build: update bootstrap.conf with xstrtol-error * bootstrap.conf: add xstrtol-error (following gnulib update, without it "make dist" fails at "update-po" stage). gnulib: update to latest * m4/.gitignore: Auto-updated by 'bootstrap'. doc: mention getnum bugfix in NEWS * NEWS: Mention bugfix 2020-02-13 Assaf Gordon datamash: bug fix 'getnum' operation ignoring 8,9 chars Digits 8 and 9 were wrongly ignored in the default numeric format (decimal): $ echo foo411586 | datamash getnum 1 4115 [ wrong result ] 411586 [ fixed result ] * src/utils.c (extract_number_types): Add missing digits to string pattern. * tests/datamash-tests-2.pl: Add tests to catch this error. 2020-01-02 Assaf Gordon maint: update copyright date to 2020 * all files: Run "make update-copyright". 2019-09-17 Assaf Gordon doc: update NEWS with version 1.5 and date 2019-09-03 Assaf Gordon datamash: new operation: cut Similar to cut(1), it copies the input field to the output as-is. The advantage over cut(1) is that combined with datamash's other features, input fields can be specified by name instead of column number, and output fields can be re-ordered and duplicated. Example: $ printf "a b c\n1 X 6\n" | datamash -W -H cut c,a,c cut(c) cut(a) cut(c) 6 1 6 Suggested by Torsten Seemann in https://lists.gnu.org/archive/html/bug-datamash/2019-09/msg00000.html * NEWS: Mention new operation. * src/datamash.c (usage): Add new op. * src/op-defs.{h,c}: Define new OP_CUT. * src/field-ops.{c}: Implement new OP_CUT. * contrib/bash-completion/datamash: Add new operation. * doc/datamash.texi, man/datamash.x: Mention new operation. * tests/datamash-tests-2.pl: Add tests.
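The idea behind the 'cut' operation above (selecting fields by header name, with re-ordering and duplication allowed) can be sketched in a few lines of Python. The function `cut_fields` is a hypothetical illustration and only loosely imitates `datamash -W -H cut c,a,c`.

```python
def cut_fields(lines, names):
    """Select whitespace-separated fields by header name, in the given order.

    Hypothetical helper for illustration; the first line is the header.
    """
    header = lines[0].split()
    index = {name: pos for pos, name in enumerate(header)}
    positions = [index[name] for name in names]  # KeyError on unknown names
    out = [[f"cut({name})" for name in names]]
    for line in lines[1:]:
        fields = line.split()
        out.append([fields[p] for p in positions])
    return out


if __name__ == "__main__":
    for row in cut_fields(["a b c", "1 X 6"], ["c", "a", "c"]):
        print("\t".join(row))
```

Because the requested names are mapped to positions once, the same name may appear several times in the output, which plain cut(1) does not allow.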
2019-08-27 Assaf Gordon datamash: new operation: getnum Extract numeric values from a string: $ echo foo-42.0-bar | datamash getnum 1 42.0 Also accept single-letter option for different numeric types: $ echo zoo-42.0-bar | datamash getnum:p 1 # positive decimal (default) 42.0 $ echo zoo-42.0-bar | datamash getnum:n 1 # Natural numbers 42 $ echo zoo-42.0-bar | datamash getnum:i 1 # Integers -42 $ echo zoo-42.0-bar | datamash getnum:d 1 # Decimal -42.0 $ echo zoo-42.0-bar | datamash getnum:h 1 # Hex 66 $ echo zoo-42.0-bar | datamash getnum:o 1 # Oct 34 * NEWS: Mention new operation. * src/datamash.c (usage): Add new op. * src/op-defs.{h,c}: Define new OP_GETNUM. * src/op-parser.c (set_op_params): Parse single-letter options. (parse_operation_params): Accept character parameters (not just numbers). * src/field-ops.{c}: Implement new OP_GETNUM. * src/utils.{h,c} (extract_number): New function. * contrib/bash-completion/datamash: Add new operation. * doc/datamash.texi, man/datamash.x: Mention new operation. * tests/datamash-tests-2.pl: Add tests. 2019-08-27 Assaf Gordon build: work-around automake 1.16 regression with non-gnu make Discussed in https://lists.gnu.org/r/automake/2019-08/msg00000.html Solved by Bruno Haible: https://lists.gnu.org/r/automake/2019-08/msg00005.html https://lists.gnu.org/r/bug-gnulib/2019-08/msg00064.html * Makefile.am (LDADD): Remove $(top_builddir) from library path.
2019-08-27 Assaf Gordon gnulib: update to latest (gettext 0.20 fix) 2019-08-23 Assaf Gordon maint: don't syntax-check for trailing spaces on a unit test file maint: appease space_before_parens syntax-check maint: ignore space_before_parens syntax-check on STREQ_LEN 2019-08-23 Assaf Gordon datamash: new operations: extname, barename Extract the file extension, and the basename without the extension: $ echo foo.tar.gz | datamash barename 1 extname 1 foo tar.gz $ echo /home/foo/bar.txt \ | datamash dirname 1 basename 1 barename 1 extname 1 /home/foo bar.txt bar txt * NEWS: Mention new operation. * src/datamash.c (usage): Add new op. * src/op-defs.{h,c}: Define new OP_BARENAME,OP_EXTNAME. * src/field-ops.{c}: Implement new OP_BARENAME,OP_EXTNAME. * src/utils.{h,c} (is_add_on_extension, guess_file_extension): New helper functions. * contrib/bash-completion/datamash: Add new operation. * doc/datamash.texi, man/datamash.x: Mention new operation. * tests/datamash-tests-2.pl: Add tests. 2019-08-23 Assaf Gordon datamash: new operations: dirname, basename Similar to dirname(1) and basename(1), but operate on a file. $ echo /home/foo/bar.txt | datamash dirname 1 basename 1 /home/foo bar.txt * NEWS: Mention new operation. * src/datamash.c (usage): Add new op. * src/op-defs.{h,c}: Define new OP_DIRNAME,OP_BASENAME. * src/field-ops.{c}: Implement new OP_DIRNAME,OP_BASENAME. * contrib/bash-completion/datamash: Add new operation. * doc/datamash.texi, man/datamash.x: Mention new operation. * tests/datamash-tests-2.pl: Add tests. 2019-08-07 Assaf Gordon datamash: accept backslash-escaped characters in field names Suggested by Renato Alves in https://lists.gnu.org/archive/html/bug-datamash/2019-08/msg00000.html The following are now possible: datamash -H sum FOO\\-BAR < input.txt datamash -H sum 'FOO\-BAR' < input.txt datamash -H sum "FOO\\-BAR" < input.txt * NEWS: Mention the feature. * doc/datamash.texi, man/datamash.x: Document the feature.
* src/op-scanner.h(MAX_IDENTIFIER_LENGTH): New limit. * src/op-scanner.c(scanner_get_token): Improve identifier scanning to allow backslashes. * tests/datamash-error-msgs.pl, tests/datamash-tests-2.pl: Add tests. 2019-08-07 Assaf Gordon datamash: fix mode/antimode incorrect results for negative values In version 1.4 and earlier, this resulted in incorrect values: $ echo -1 | datamash mode 1 1.844674407371e+19 Reported by Renan Valieris in https://lists.gnu.org/archive/html/bug-datamash/2019-06/msg00001.html * src/utils.c (mode_value): Fix type from size_t to long double. * tests/datamash-tests-2.pl: Add test. * NEWS: Mention fix. 2019-01-02 Assaf Gordon maint: update gnulib to latest * gnulib: Update to latest. * bootstrap, tests/init.sh: Sync from gnulib. maint: update copyright date to 2019 * all files: Run "make update-copyright". 2018-12-23 Assaf Gordon doc: update release date for version 1.4 * NEWS: Update date. 2018-12-22 Assaf Gordon gnulib: update to latest 2018-12-15 Assaf Gordon maint: update .gitignore Following gnulib recent update. * m4/.gitignore: Update file list. 2018-12-15 Assaf Gordon build: update autoconf version requirements Require version 2.69 to bootstrap. After recent gnulib update, gnulib-tool would complain: $ ./gnulib/gnulib-tool ./gnulib/gnulib-tool: *** minimum supported autoconf version is 2.63. \ Try adding AC_PREREQ([2.63]) to your configure.ac. ./gnulib/gnulib-tool: *** Stop. Hence this update. * configure.ac, bootstrap.conf: Require autoconf version 2.69 or later. 2018-12-14 Assaf Gordon build: add special CPPFLAGS for windows binaries FORCE_C_LOCALE: Force C/POSIX locale, instead of assuming it is available as an environment variable (not easily changeable in cmd.exe). SORT_WITHOUT_LOCALE: don't add "LC_ALL=C" to the sort command-line execution string - the windows interpreter (cmd.exe) can't handle that. * src/datamash.c (main): Add preprocessor conditionals. 
2018-12-08 Assaf Gordon datamash: new option -C/--skip-comments Suggested by Mark van Rossum in https://lists.gnu.org/r/bug-datamash/2018-11/msg00001.html https://lists.gnu.org/r/bug-datamash/2018-12/msg00003.html * NEWS: Mention new option. * src/datamash.c (usage, main): Update getopt parsing and help screen. * src/text-options.{c,h} (skip_comments): New variable. * src/text-lines.{c,h} (line_record_fread): New option to skip comments. (line_record_is_comment): New function. * contrib/bash-completion/datamash: Update bash auto-completion options. * man/datamash.x: Add "skipping comment lines" section. * tests/datamash-tests-2.pl: Add new test cases. 2018-12-08 Assaf Gordon gnulib: update to latest 2018-08-22 Assaf Gordon doc: fix man-page typo Reported by Steve Ward in https://lists.gnu.org/r/bug-datamash/2018-08/msg00004.html . * man/datamash.x: Fix missing formatting letter B. 2018-03-23 Assaf Gordon tests: disable --format="%a" test Valid output can differ (e.g. 0x8.000p-3 vs 0x1.000p+0). * tests/datamash-output-format.pl: Disable 'a1' test. 2018-03-23 Assaf Gordon tests: fix --format='%4000f' expected output Can be 1.000009... or 1.000008999, depending on representation. * tests/datamash-output-format.pl: Check only the first 5 digits. 2018-03-17 Assaf Gordon doc: update NEWS with new version and date * NEWS: Set version 1.3 and date. 2018-01-25 Assaf Gordon maint: fix syntax-check issues * doc/datamash-texinfo.css: Limit to 80 characters. * po/POTFILES.in: Add missing files. * src/double-format.c: Remove unused "ignore-value.h". * src/text-options.{c,h}: Space after parens. 2018-01-23 Assaf Gordon maint: change URLs from http to https * all files: Change to https URLs. datamash: add --format=FORMAT option * NEWS: Mention this. * Makefile.am: * doc/datamash.texi: Mention new option. * src/datamash.c * src/double-format.{c,h}: * src/text-options.{c,h}: * tests/datamash-error-msgs.pl, tests/datamash-output-format.pl: Test new option.
* tests/datamash-valgrind.sh: Test large output buffer under valgrind. 2018-01-18 Assaf Gordon datamash: add -R/--round=N option Print numeric values with N decimal places. Example: $ echo 1.1 | datamash --round=5 sum 1 1.10000 Requested in https://github.com/agordon/datamash/issues/5 . * NEWS: Mention this. * src/datamash.c (usage): Mention new option. * configure.ac: Remove nonliteral-printf warning. * doc/datamash.texi: Mention new option. * src/text-options.c,src/text-options.h (numeric_output_format, numeric_output_bufsize): New variables. (set_numeric_output_precision,finalize_numeric_output_buffer): New functions to set the printf style based on requested precision. * src/field-ops.c: Print numeric output using new format variables. * tests/datamash-error-msgs.pl: Test invalid --round usage. * tests/datamash-output-format.pl: Test --round usage. * Makefile.am: Add new test file. 2018-01-04 Assaf Gordon maint: update copyright year to 2018 * all files: Update with 'make update-copyright'. gnulib: update to latest 2017-12-11 Assaf Gordon maint: add github ISSUE/PULL-REQUEST templates Encourage contributors to use bug-datamash@gnu.org instead of github. * .github/ISSUE_TEMPLATE.txt, .github/PULL_REQUEST_TEMPLATE.txt: New files. 2017-11-15 Assaf Gordon gnulib: update to latest * m4/.gitignore: Automatically updated by newest gnulib bootstrap. 2017-11-15 Assaf Gordon datamash: new operation 'trimmean' To calculate 20% trimmed mean: $ printf "%s\n" 13 3 7 33 3 9 | datamash trimmean:0.2 1 8 Using 'trimmean:0' is equivalent to 'mean'. Using 'trimmean:0.5' is equivalent to 'median' (for R compatibility). Suggested by Khavish Bhundoo in https://lists.gnu.org/archive/html/bug-datamash/2017-10/msg00004.html . * NEWS: Mention new operation. * src/datamash.c (usage): Add new op. (print_column_headers): Print percent value in headers. * src/op-defs.{h,c}: Define new OP_TRIMMED_MEAN. * src/field-ops.{h,c}: Implement new OP_TRIMMED_MEAN.
* src/op-parser.c: (set_op_params): Handle 'trimmean:N' parameter. * src/utils.{h,c} (trimmed_mean_value): New function. * contrib/bash-completion/datamash: Add new operation. * doc/datamash.texi, man/datamash.x: Mention new operation. * tests/datamash-error-msgs.pl: Test 'trimmean' parameter error messages. * tests/datamash-stats.pl: Add tests, compare against R results. 2017-09-07 Assaf Gordon build: indicate which hash functions are used at end of ./configure * NEWS: Mention the change (and the link fix from previous commit). * configure.ac: Print 'internal' or 'external' based on $LIB_CRYPTO. 2017-09-07 Jeroen Roovers build: link with external crypto libraries if needed Datamash uses gnulib's sha* modules, which allow linking with OpenSSL for increased performance (using ./configure --with-openssl=yes). Adjust the LDADD flags to link with the external libraries. Reported in http://lists.gnu.org/archive/html/bug-datamash/2017-08/msg00007.html . * Makefile.am (datamash_LDADD): Add $(LIB_CRYPTO) 2017-09-06 Assaf Gordon datamash: new --output-delimiter=X option Suggested by Dave Myron in https://lists.gnu.org/archive/html/bug-datamash/2017-08/msg00008.html . * src/datamash.c (explicit_output_delimiter): New variable. (main): Handle new option. (usage): Mention new option. * NEWS: Mention feature. * contrib/bash-completion/datamash: Add option. * doc/datamash.texi: Mention new option. * tests/datamash-tests-2.pl, tests/datamash-error-msgs.pl: Test new option. 2017-09-06 Yu Fu doc: fixed two typos Reported in: https://github.com/agordon/datamash/pull/2 . * examples/readme.md: Fixed transcribes -> transcribed. 2017-08-23 Assaf Gordon version 1.2 * NEWS: Record release date. maint: remove missing files from Makefile.am * Makefile.am: Remove files (which were deleted in previous commit).
maint: remove outdated build-aux scripts * build-aux/check-remote-make-all.sh, build-aux/check-remote-make-extra.sh, build-aux/check-remote-make-git.sh, build-aux/check-remote-make.sh: Precursors to http://pretest.nongnu.org, no need for standalone scripts any more. * build-aux/gen-coverage-report.sh, build-aux/rebuild-coverage.sh: obsoleted by 'make coverage'. * build-aux/tag-new-version.sh: Removed. maint: update prerelease checks script * build-aux/prerelease-checks.sh: Remove outdated checks, add non-root-installation check. maint: fix 'syntax-check' errors * src/crosstab.c: space-before-parens. * src/datamash.c: double-word. gnulib: update to latest 2017-08-21 Assaf Gordon tests: skip tests if paste(1) is not found On Alpine linux, paste(1) is not installed by default - skip instead of failing. * init.cfg (require_paste_): New shell function. * tests/datamash-rand.sh, tests/datamash-sort-errors.sh: Skip if paste(1) is not found. 2017-08-21 Assaf Gordon tests: quote '^' in shell commands The '^' character has special meaning in older shells (e.g. OpenSolaris 10/sparc). Quote it to prevent test failures. 2017-08-21 Assaf Gordon tests: accept '-nan' under OpenBSD OpenBSD systems return '-nan' instead of 'nan' for two tests. Ignore the '-' prefix. * tests/datamash-tests-2.pl (narm12,narm15): Discard '-' prefix. 2017-08-21 Assaf Gordon datamash: fix incorrect int size for percentiles On SunOS 5.11/i86pc, sizeof(size_t)==4 while sizeof(uintmax_t)==8. This leads to incorrect printf output. Reported by Dagobert Michelseni in https://lists.gnu.org/archive/html/bug-datamash/2017-08/msg00004.html . A better long term solution is to switch entirely from size_t to uintmax_t. * src/op-parser.c, src/datamash.c: use PRIuMAX for printf, and convert to uintmax_t.
2017-08-10 Assaf Gordon maint: improve bash-completion script installation (again) Reported by Bruno Haible in https://lists.gnu.org/archive/html/bug-datamash/2017-08/msg00002.html: the current installation defaults to global location (e.g. /usr/share/bash-completion.d) and does not work with non-root installation. Change default to local installation, with option to auto-detect the system's global location. Usage: ./configure --with-bash-completion-dir=[no|local|global|PATH] See README for more details. * configure.ac: Change --with-bash-completion-dir processing. * NEWS: Mention this. * README: Document this. 2017-08-08 Assaf Gordon bash-completion: add range,perc operations * contrib/bash-completion/datamash: Add range/perc (percentile) operations to list of suggested completions. 2017-06-08 Assaf Gordon datamash: add lines/fields option to 'check' operation Datamash will fail with non-zero exit code if the input does not have the expected number of lines/fields. Typical usage: $ seq 10 | paste - - | datamash check 2 fields && echo ok 5 lines, 2 fields ok $ seq 10 | paste - - | datamash check 6 fields && echo ok line 1 (2 fields): 1 2 datamash: check failed: line 1 has 2 fields (expecting 6) $ seq 10 | datamash check 11 lines datamash: check failed: input had 10 lines (expecting 11) * NEWS: Mention new options. * src/datamash.c (tabular_check_file): Implement additional checks. * src/op-parser.h (struct mode_check_params_t): New struct to hold options. (struct datamash_ops): Add new struct. * src/op-parser.c (parse_check_line_or_field, parse_mode_check): New functions to parse 'check' options. (parse_mode): Parse option in 'check' mode. * tests/datamash-parser.pl: Add parsing tests. * tests/datamash-check.pl: Add 'check' tests. * Makefile.am: Add 'datamash-check.pl' test script. * man/datamash.x: Add 'check' examples. * doc/datamash.texi: Expand 'check' section.
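The lines/fields validation added to 'check' in the entry above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the logic of tabular_check_file: the function name `check_table`, its parameters, and the use of exceptions in place of a non-zero exit code are all hypothetical.

```python
def check_table(lines, expect_lines=None, expect_fields=None, delimiter="\t"):
    """Verify that every line has the same number of fields.

    Optionally also verify an expected field count and line count.
    Raises ValueError where datamash would exit with a non-zero status.
    """
    n_fields = None
    for number, line in enumerate(lines, start=1):
        count = len(line.split(delimiter))
        if expect_fields is not None and count != expect_fields:
            raise ValueError(
                f"check failed: line {number} has {count} fields "
                f"(expecting {expect_fields})"
            )
        if n_fields is None:
            n_fields = count
        elif count != n_fields:
            raise ValueError(
                f"check failed: line {number} has {count} fields "
                f"(previous lines had {n_fields})"
            )
    if expect_lines is not None and len(lines) != expect_lines:
        raise ValueError(
            f"check failed: input had {len(lines)} lines (expecting {expect_lines})"
        )
    return f"{len(lines)} lines, {n_fields or 0} fields"


if __name__ == "__main__":
    # Same shape as: seq 10 | paste - -
    rows = ["1\t2", "3\t4", "5\t6", "7\t8", "9\t10"]
    print(check_table(rows, expect_fields=2))
```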
2017-06-07 Assaf Gordon maint: mark more lines as LCOV_EXCL_LINE for more accurate coverage 2017-05-26 Assaf Gordon gnulib: update to latest 2017-05-26 Assaf Gordon datamash: free crosstab memory if LINT Detected by Coverity scan (CID 72245, 72246). * src/crosstab.c (print_crosstab): Call free() if lint is enabled. 2017-05-26 Assaf Gordon build: add lint option to configure script Usage: ./configure --enable-lint Use during development to suppress non-critical warnings. * configure.ac: Add --enable-lint option. * HACKING: Mention this option. 2017-05-09 Assaf Gordon build: add src/die.h to source files list Omitted from previous commit v1.1.1-10-gedab279 "datamash: use die() instead of error()". * Makefile.am (datamash_SOURCES): Add missing file. 2017-05-09 Assaf Gordon datamash: new operation 'range' Calculate the range of values in the group: $ printf '%s\n' 5 3 14 6 | datamash range 1 11 Using 'range' is equivalent to: $ printf '%s\n' 5 3 14 6 | datamash min 1 max 1 | awk '{print $2-$1}' 11 Suggested by Kingsley G. Morse Jr. in https://lists.gnu.org/archive/html/bug-datamash/2017-05/msg00000.html . * NEWS: Mention new operation. * src/datamash.c: (usage): Add new op. * src/op-defs.{h,c}: Define new OP_RANGE. * src/field-ops.{h,c}: Implement new OP_RANGE. * tests/datamash-stats.pl: Test new operation. * tests/datamash-tests-2.pl: Test new operation with various inputs. * doc/datamash.texi, man/datamash.x: Mention new operation. 2017-05-09 Assaf Gordon build: add 'fallthrough' comments to suppress gcc-7.1 warnings gcc-7.1 warns about: src/op-parser.c:491:10: error: this statement may fall through [-Werror=implicit-fallthrough=] if (scan_val_int>0) ^ The fallthrough is intentional - add an explicit comment that will be detected by the compiler and suppress the warnings. * src/op-parser.c: Add explicit 'fallthrough' comment in two intentional fall through cases.
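For the 'range' operation recorded in the 2017-05-09 entry above, the arithmetic is simply the maximum minus the minimum of the group. A minimal Python sketch, with the hypothetical helper name `value_range`:

```python
def value_range(values):
    """Return max - min of the numeric values (illustrative helper only)."""
    numbers = [float(v) for v in values]
    if not numbers:
        raise ValueError("range of an empty group is undefined")
    return max(numbers) - min(numbers)


if __name__ == "__main__":
    # Same input as: printf '%s\n' 5 3 14 6 | datamash range 1
    print(f"{value_range(['5', '3', '14', '6']):g}")
```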
2017-05-09 Assaf Gordon datamash: use die() instead of error() Copied from GNU coreutils v8.25-79-g492dcb2, http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=492dcb2eb1: "die() has the advantage of being apparent to the compiler that it doesn't return, which will avoid warnings in some cases, and possibly generate better code." * src/die.h: Copied from GNU coreutils. * src/op-parser.c, src/op-scanner.c, src/field-ops.c, src/datamash.c: Changed 'error(EXIT_FAILURE,...)' to 'die(...)'. 2017-05-09 Assaf Gordon build: require pkg-config>=0.28 when building from git configure.ac uses the PKG_CHECK_VAR macro from pkg-config's pkg.m4 file for bash-completion directory detection. This macro is not available in older pkg-config versions. This is only required when building from git and running ./bootstrap. Building from tarball does not require pkg-config at all. * bootstrap.conf: Specify minimum version for pkg-config. 2017-05-03 Assaf Gordon build: remove bashcompdir from Makefile.am It is already defined in configure.ac (warning generated by autoreconf). * Makefile.am: Remove bashcompdir. 2017-04-19 Assaf Gordon doc: expand build instructions in README and HACKING.md Explain how to build from git. 2017-04-19 Assaf Gordon build: require 'pkg-config' when bootstrapping pkg-config is used to detect the bash-completion directory. It is optional during ./configure (and ./configure will succeed even if pkg-config is not installed), but pkg.m4 is required by ./bootstrap for the PKG_CHECK_VAR in 'configure.ac'. * bootstrap.conf: Add pkg-config requirement. 2017-04-06 Assaf Gordon gnulib: update to latest 2017-03-24 Barry Nisly datamash: new operation 'perc' (percentiles) Usage: $ seq 100 | datamash --header-out perc:95 1 perc:95(field-1) 95.05 See https://lists.gnu.org/archive/html/bug-datamash/2017-03/msg00000.html . * NEWS: Mention new operation. * src/datamash.c: (usage): Add new op. (print_column_headers): Print percentile value in headers.
* src/op-defs.{h,c}: Define new OP_PERCENTILE. * src/field-ops.{h,c}: Implement new OP_PERCENTILE. * src/op-parser.c: (set_op_params): Handle 'perc:N' parameter. * tests/datamash-stats.pl: Test new operation. * tests/datamash-tests.pl: Test 'perc' headers. * doc/datamash.texi, man/datamash.x: Mention new operation. * tests/datamash-error-msgs.pl: Test 'perc' parameter error messages. 2017-03-17 Assaf Gordon build: install bash-completion script by default (if possible) If the 'bash-completion' package is installed with pkg-config support, detect the default bash-completion directory and install the datamash completion script there (typically: /usr/share/bash-completion/completions). Otherwise, install in /usr/local/share/datamash/bash-completion.d (as before). Disable with `./configure --with-bash-completion-dir=no`. Set custom path with `./configure --with-bash-completion-dir=PATH`. * README: Mention bash-completion options to './configure'. * configure.ac: Add --with-bash-completion-dir=DIR option, and autodetect bash-completion directory using pkg-config. * Makefile.am: Add bash-completion target (if enabled). 2017-03-17 Assaf Gordon maint: update .gitignore gnulib: update to latest 2017-01-20 Assaf Gordon maint: update bootstrap script to latest * bootstrap: Copy latest version from gnulib/build-aux/bootstrap. maint: update NEWS gnulib: update to latest 2017-01-11 Assaf Gordon build: fix out-of-tree builds without dep-tracking Create src,lib,doc,tests subdirectories at the end of 'configure', preventing build problems when building out of tree with --disable-dependency-tracking. Inspired by sed's https://bugs.gnu.org/25371 . * configure.ac: Call AS_MKDIR_P() to create subdirectories. 2017-01-11 Assaf Gordon maint: update copyright year to 2017 Automatic update with 'make update-copyright'. maint: add gnulib's 'update-copyright' module * bootstrap.conf: Add 'update-copyright'.
* cfg.mk: Define 'update-copyright-env' with relevant values for 'make update-copyright' maint: update gnulib to latest * m4/.gitignore: Ignore few more m4 files. 2016-09-23 Assaf Gordon doc: fix man page escapes Suggested by Dave Love in http://lists.gnu.org/archive/html/bug-datamash/2016-02/msg00002.html 2016-09-23 Assaf Gordon datamash: bugfix: trailing delimiter should count as an empty field In version 1.1.0 and earlier, trailing delimiter at the end of the line did not count as an empty field: $ printf "a:b:\n" | datamash -t: check 1 line, 2 fields After bugfix: $ printf "a:b:\n" | datamash -t: check 1 line, 3 fields Related to bug report by Sanjeev Kumar Sharma in http://lists.gnu.org/archive/html/bug-datamash/2016-09/msg00000.html 2016-09-23 Assaf Gordon datamash: 'transpose' bugfix (missing values in last line) Missing values in the last line would cause fields to be silently dropped: $ printf "1\t2\na\n" | datamash --no-strict transpose 1 a After bugfix: $ printf "1\t2\na\n" | datamash --no-strict transpose 1 a 2 N/A 2016-01-17 Assaf Gordon maint: update date in NEWS * NEWS: update release version, mention version bump. doc: add 'field delimiters' section to manual * doc/datamash.texi: add new 'field delimiters' section. 2016-01-03 Assaf Gordon maint: update gnulib to latest version * gnulib: update to latest version with copyright year adjusted * tests/init.sh, bootstrap: sync with latest gnulib. maint: update all copyright year number ranges 2016-01-01 Assaf Gordon maint: rename pmccabe makefile target name * Makefile.am: rename 'CYCLO_SOURCES' to 'cyclo_sources', to avoid confusing autoreconf about missing library 'CYCLO'.
doc: add rounding,binning sections to manual doc: add 'crosstab' section to manual doc: add 'check' usage example doc: add 'groupby on /etc/passwd' usage example section doc: merge reverse+transpose nodes doc: added 'Usage Examples' chapter doc: reword 'overview' section doc: add custom CSS to the texinfo HTML output * doc/datamash-texinfo.css: new custom styles * doc/local.mk: use new CSS file in 'make html' * Makefile.am: use new CSS file in 'make web-manual' doc: update manual with new operations 2015-12-26 Benno Schulenberg datamash: improve an error message datamash: gettextize an option description as a whole datamash: remove redundant header, and refer correctly to the man page datamash: improve the punctuation and grammar of the usage text 2015-12-20 Assaf Gordon doc: update NEWS for upcoming version maint: update gnulib submodule to latest version 2015-12-18 Assaf Gordon datamash: new operation: strbin (binning strings to numeric value) hashes a text string into a number, returns values from 0 to #bins (default 10). Can be used to bin many free-text input values into a finite (small) set of bins. Identical strings will be hashed to same numeric values. 
Examples: assign a bin (0 to 9) to a list of strings (bin number added as last field): seq 100 | xargs printf "user-%d\n" \ | datamash --full strbin 1 identical input values will be assigned identical bin number: ( seq 100 ; seq 20 ) | xargs printf "user-%d\n" \ | datamash --full strbin 1 Splitting input values into 10 output files, while keeping identical keys in the same file: ( seq 100 ; seq 20 ) | xargs printf "user-%d\n" \ | datamash --full strbin 1 \ | awk '{print $0 > $NF ".txt"}' * bootstrap.conf: add gnulib's hash-pjw-bare module * src/op-defs.{c,h}: add new op constant * src/field-ops.{c,h}: implement new operation * src/op-parser.c: strbin can take a parameter (# bins) * src/datamash.c: update usage() * man/datamash.x: update man page * contrib/bash-completion/datamash: add new op * tests/datamash-error-msgs.pl, tests/datamash-strbin: test new operation * Makefile.am: add new tests 2015-12-18 Assaf Gordon maint: use gnulib's pmccabe2html module Generate report with: make cyclo-datamash.html * bootstrap.conf: add the pmccabe2html module * Makefile.am: add new target * .gitignore: ignore generated html file 2015-11-13 Assaf Gordon maint: remove superfluous 'with' from doc's "invariant section" pointed by Debian/Ubuntu package maintainer (alejandro@debian.org). * doc/datamash.texi: remove 'with' from 'invariant section' of GFDL to comply with official wording. 2015-07-14 Assaf Gordon build: prefer %PRIuMAX to %zu in printf %zu doesn't play well with mingw cross-compilers. * src/column-headers.c, src/op-parser.c: include <inttypes.h>, use %PRIuMAX instead of '%zu' in printf. 2015-07-14 Assaf Gordon build: add gnulib's stpcpy module (for mingw) * bootstrap.conf: add stpcpy. 2015-07-14 Assaf Gordon datamash: fail on invalid cov/pearson input Unequal number of items can occur if ignoring N/A values. * src/field-ops.c: fail if the two fields have different number of items. * tests/datamash-pair-tests.pl: tests such inputs.
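The cov/pearson entry above describes a failure mode that is easy to reproduce: if N/A values are dropped from each field independently, the two fields can end up with different item counts and a paired statistic is no longer defined. The Python sketch below illustrates the guard using a population covariance; the function name, the "N/A" marker and the use of an exception are assumptions for illustration, not datamash's code.

```python
def population_covariance(xs, ys, na="N/A"):
    """Population covariance of two fields, ignoring N/A markers.

    Hypothetical helper: fails when the fields end up with different lengths.
    """
    x = [float(v) for v in xs if v != na]
    y = [float(v) for v in ys if v != na]
    if len(x) != len(y):
        raise ValueError(
            f"input error: fields have different number of items "
            f"({len(x)} vs {len(y)})"
        )
    if not x:
        raise ValueError("no numeric values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    print(population_covariance(["1", "2", "3"], ["2", "4", "6"]))
```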

2015-07-14  Assaf Gordon

	datamash: fail on incorrect usage
	* src/op-parser.c: fail when primary operations are given field
	ranges or pairs (e.g. 'groupby 1:2').
	fail when crosstab is given more than one operation.
	* tests/datamash-{crosstab,error-msgs,parser}.pl: add new tests.

	doc: improve man page (new ops, examples)
	* man/datamash.x: mention new operations, use 'field' instead of
	'column', add more examples, improve formatting.

	datamash: improve usage()
	* src/datamash.c: improve usage(), mention new operations,
	use 'field' instead of 'column'.

2015-07-14  Assaf Gordon

	new operations: round,floor,ceil,trunc,frac
	typical usage:
	    $ (echo X ; seq -1 0.5 1 ) \
	       | datamash -H --full round 1 floor 1 ceil 1 trunc 1 frac 1
	    X     round(X)  floor(X)  ceil(X)  trunc(X)  frac(X)
	    -1.0  -1        -1        -1       -1        0
	    -0.5  -1        -1        0        0         -0.5
	    0.0   0         0         0        0         0
	    0.5   1         0         1        0         0.5
	    1.0   1         1         1        1         0
	* bootstrap.conf: add gnulib's floorl,ceill,roundl,modfl modules.
	* src/op-defs.{c,h}: add new operations constants.
	* src/field-ops.c: handle new operations.
	* tests/datamash-tests-2.pl: add tests.
	* contrib/bash-completion/datamash: add new operations.

2015-07-14  Assaf Gordon

	utils: add functions for long-double zero handling
	* src/utils.c: is_zero(), is_signed_zero() - return true/false.
	pos_zero(): negates a negative zero, nop otherwise.

2015-07-14  Assaf Gordon

	new operation: bin (binning into buckets)
	typical usage (bin into buckets of size 5):
	    $ seq 0 10 | ./datamash bin:5 1
	    0
	    0
	    0
	    0
	    0
	    5
	    5
	    5
	    5
	    5
	    10
	* src/op-defs.{c,h}: add new operation constant and name.
	* src/op-parser.c: special handling for operation with an
	optional parameter.
	* src/field-ops.h: add optional parameter to struct fieldop.
	* src/field-ops.c: implement binning operation.
	* tests/*.pl: add tests

2015-07-14  Assaf Gordon

	datamash: reword error messages
	* src/op-parser.c: use 'field' instead of 'column', standardize
	error messages.
	* tests/*.pl: adapt tests to new wording.

2015-07-14  Assaf Gordon

	build: fix long-double precision under NetBSD
	Under NetBSD (and others?) default precision is set to 'double' even
	for 'long double' variables. This results in incorrect results for
	'roundl' and crashes for 'expl' (when using gnulib's implementations).
	Force 'long double' precision.
	See gnulib's 'lib/fpucw.h' file for many more details.
	* bootstrap.conf: add fpucw module.
	* src/datamash.c: force 'long double' precision.

2015-07-14  Assaf Gordon

	tests: add rmdup expensive tests
	* src/datamash.c: add undocumented option to reduce initial size of
	rmdup structures (forcing more frequent reallocations).
	* tests/datamash-valgrind.sh: use new option, validate results.

	tests: add pcov with header
	* tests/datamash-pair-tests.pl: test pcov (taking two fields) with
	output headers (print 1 field, 1 header).

2015-07-14  Assaf Gordon

	refactor: create new scanner module
	Use proper tokenizing instead of relying on argc/argv 'tokens' with
	ad-hoc string manipulation. In the future, this will allow accepting
	the script as one string (like sed's -e).
	* bootstrap.conf: add gnulib's c-type module
	* src/op-scanner.{c,h}: new scanner module, improved tokenizing.
	* src/op-parser.c: use new scanner, support _STANDALONE_ mode for
	quicker testing (without autotools,gnulib).
	* src/op-defs.c: prefer name 'groupby' to 'grouping' when reporting
	errors.
	* src/system.h: support _STANDALONE_ mode for quicker testing
	(without autotools,gnulib).
	* tests/datamash-{tests,parser,crosstab}.pl: adapt error message text
	to new parser (no change in functionality).
	* Makefile.am: include new scanner.
	* po/POTFILES.in: include new scanner.

2015-06-26  Assaf Gordon

	build: move 'assert.h' to 'system.h'
	* src/op-defs.c: move 'assert.h' from here...
	* src/system.h: to here. Avoids 'syntax-check' warnings about
	including 'assert.h' without using it (it is used indirectly with
	'internal_error()' macro).

	build: conditionally include minmax.h
	Don't include "lib/minmax.h" if MAX is already defined.
	Seems redundant (per the comment in minmax.h), but tcc 0.9.26 still
	fails.

	maint: force -std=c99 with static analysis
	* cfg.mk: force -std=c99 when using clang's scan-build program,
	as it does not do so automatically, and compilation fails.

	maint: improve configure messages
	* configure.ac: simplify printing of examples directory;
	print bash-completion directory.

2015-06-26  Assaf Gordon

	datamash: improve,install bash-completion script
	By default, script installed to program's datadir, e.g.
	    /usr/local/datamash/share/bash-completion.d/datamash
	and the user can manually source it from ~/.bashrc.
	Optionally, can be installed to another location, e.g.
	    ./configure --with-bash-completion-dir=/etc/bash_completion.d
	to be automatically loaded.
	* contrib/bash-completion/datamash: fix script, improve suggestions,
	do not use '_init_completion' (not available in bash-comp<1.9)
	* configure.ac: add option --with-bash-completion-dir
	* Makefile.am: install script.

2015-06-26  Assaf Gordon

	datamash: new operation: Pearson Correlation
	usage (calculate pearson (population) correlation coefficient
	on fields 1,2):
	    datamash ppearson 1:2 < 1.txt
	* src/op-defs.{c,h}: new op enums.
	* src/op-parser.c: handle new op.
	* src/utils.{c,h}: implement pearson correlation function.
	* src/field-ops.c: handle new op.
	* tests/datamash-pair-tests.pl: add tests

2015-06-26  Assaf Gordon

	datamash: new operation: covariance
	usage (calculate population-covariance on fields 1,2):
	    datamash pcov 1:2 < 1.txt
	* src/op-defs.{c,h}: new op enums.
	* src/op-parser.c: handle new operation, with special parsing for
	column pairs (e.g. '1:2').
	* src/field-ops.{c,h}: implement new operation, and add master/slave
	field-op options (as covariance is implemented as two linked field
	ops, one for each column).
	* src/datamash.c: skip 'slave' field-ops when printing.
	* tests/datamash-parser.pl: test column-pairs parsing
	* tests/datamash-pair-tests.pl: test column-pair operations.
	* tests/datamash-valgrind.sh: test covariance operation.
	* Makefile.am: include new tests.

2015-06-26  Assaf Gordon

	datamash: utils: add covariance implementation
	* src/utils.{c,h}: covariance_value() - new function.

2015-06-26  Assaf Gordon

	datamash: improve parser, allow field range
	allows:
	    datamash sum 1-5,7-9
	* src/op-parser.c: improve field parsing
	* tests/datamash-parser.pl: add new tests

2015-06-26  Assaf Gordon

	datamash: new feature: check
	reports back number of lines,fields - or exit with failure if the
	input lines do not have the same number of fields.
	usage:
	    $ seq 10 | paste - - | datamash check
	    5 lines, 2 fields
	    $ seq 9 | paste - - | datamash check
	    line 4 (2 fields):
	      7 8
	    line 5 (1 fields):
	      9
	    ./datamash: check failed: line 5 has 1 fields (previous line had 2)
	* src/datamash.c: tabular_check_file() - new function.
	* src/op-defs.{h,c}: new mode enum.
	* src/op-parser.c: handle new mode.
	* tests/datamash-check-tabular.pl: add tests.
	* Makefile.am: add new test file.

2015-06-26  Assaf Gordon

	maint: git-ignore more files

2015-06-26  Assaf Gordon

	datamash: new feature: crosstab
	usage:
	    datamash crosstab 1,2 sum 3
	* src/crosstab.{c,h} - implementation
	* src/datamash.c - handle new crosstab processing mode.
	* src/op-defs.{c,h} - additional crosstab constants.
	* src/op-parser.c - special handling for crosstab mode.
	* tests/datamash-crosstab.pl - new tests.
	* Makefile.am - add new files.

2015-06-26  Assaf Gordon

	maint: add GL_ATTRIBUTE_PURE where needed

	maint: fix NEWS syntax-check
	* NEWS: add missing empty line.

2015-06-26  Assaf Gordon

	maint: add coverage hints
	1. Add LCOV_EXCL_BR_LINE to skip a few switch cases where the default
	   case should never be reached.
	2. Add LCOV_EXCL_LINE.
	LCOV_EXCL_BR_LINE requires geninfo from lcov>=1.11.

2015-06-26  Assaf Gordon

	maint: remove unused strcnt module

	datamash: improve parser module, add tests
	* src/datamash.c: remove redundant left-over code.
	* src/op-parser.c: improve token processing
	* src/op-defs.c: rename 'grp' alias to 'gb' (=groupby)
	* tests/datamash-parser.pl: new tests
	* Makefile.am: add new tests.

	maint: fix syntax-check errors

	datamash: move config vars to text-options
	* src/datamash.c,src/field-ops.c: move several configuration
	parameters from here ...
	* src/text-options.{c,h}: ... to here.

	maint: fix empty variable init
	* src/field-ops.c: CLANG complained about '{0}'.
	replace with memset().

	refactor: move field-operation list.
	* src/field-ops.{c,h}: refactored to handle just one struct at a time.
	private functions removed from the header file.
	* src/op-parser.{c,h}: the list of field-ops is handled here.
	* src/datamash.c: adapted as needed.

2015-06-26  Assaf Gordon

	refactor: create parsing module
	1. Better handling of processing modes
	   (e.g. group-by/transpose/reverse/rmdup) vs operations
	   (e.g. sum/count/min/max/mean).
	2. Support new syntax:
	       datamash groupby 1,2 sum 3
	   equivalent to:
	       datamash -g 1,2 sum 3
	* src/op-defs.{c,h} - list of processing modes and operations.
	* src/op-parser.{c,h} - parsing module. returns structure with parsed
	information (mode, grouping fields, operations, etc.).
	* src/op-fields.{c,h} - adapt to new parsing, remove old parsing
	functions.
	* src/datamash.c - adapt to new parsing module.
	* tests/*.pl - minor changes to error messages.

2015-06-26  Assaf Gordon

	refactor: extract group-processing to a function
	* src/datamash.c: process_group(): new function to print/summarize
	each group.

	refactor: fields-ops printing
	* src/field-ops.{h,c}: field_op_summarize_empty(),
	field_op_summarize(): store results in 'op->out_buf', don't directly
	print to stdout.
	summarize_field_ops(), field_op_print_empty_value(): print buffer to
	stdout.

	refactor: print empty values (for testing)
	* src/datamash.c: move empty value printing code from here ...
	* src/field-ops.{h,c}: ... to new function
	field_op_print_empty_value().

2015-06-22  Assaf Gordon

	maint: add NEWS about 1.0.7 features

2015-05-29  Assaf Gordon

	build: update gettext/autoconf/automake versions
	* bootstrap.conf: require gettext>=0.19.4, autoconf>=2.62,
	automake>=1.11.1.
	* configure.ac: require gettext>=0.19.4
	* po/ChangeLog: updated to mention gettext 0.19.4 by 'gettextize -f'

2015-05-28  Assaf Gordon

	maint: update gnulib to latest version.
	specifically, for mingw fixes:
	http://lists.gnu.org/archive/html/bug-gnulib/2015-05/msg00026.html
	http://lists.gnu.org/archive/html/bug-gnulib/2015-05/msg00030.html
	http://lists.gnu.org/archive/html/bug-gnulib/2015-05/msg00031.html

2015-05-27  Assaf Gordon

	maint: improve printf portability, remove mingw hack
	* bootstrap.conf: use gnulib's inttypes,stdint,extensions modules.
	See http://lists.gnu.org/archive/html/bug-gnulib/2015-05/msg00021.html .
	* configure.ac: remove mingw hack. gnulib's extensions module should
	take care of making PRIuMAX work correctly.
	* src/datamash.c: use PRIuMAX as format specifier in printf, instead
	of previous ugly mingw hack (which also didn't work with translations,
	see http://lists.gnu.org/archive/html/bug-datamash/2015-05/msg00001.html ).

	maint: avoid compiler warning about no-return.
	* src/fields-ops.c: add 'return' which should never be reached.

2015-05-20  Assaf Gordon

	tests: improve portability of perl Digest::MD5
	On some systems (Centos7, Fedora21), Perl's Digest::MD5 is not
	installed by default (despite officially being a core module).
	If so, skip the MD5 tests instead of failing.
	* tests/datamash-md5.pl: separate tests file for MD5.
	* tests/datamash-tests{,-2}.pl: don't use Digest::MD5 by default.
	* Makefile.am: add the new test file

2015-05-19  Assaf Gordon

	maint: improve tests portability of inf/nan
	Depending on the OS/libc, printf(Not-a-number) can be 'nan' or 'NaN'.
	printf(Infinity) can be 'inf', 'Inf', 'Infinity'.
	Instead of guessing it, add undocumented option to print the value on
	the current system.
	* src/datamash.c: add undocumented options to print the values of
	inf/nan/progname.
	* tests/datamash-{tests,tests-2,stats}.pl: use the undocumented option
	as the expected result string of various tests.

2015-01-18  A. Gordon

	build: update to latest gnulib
	* gnulib: Update to latest
	* tests/init.sh: update from gnulib

2015-01-18  Benno Schulenberg

	doc: fix some inconsistencies

	doc: do not mention the nonexistent options -h and -v

2015-01-18  A. Gordon

	maint: update copyright year to 2015

	maint: enable po translation
	* bootstrap.conf: remove SKIP_PO
	* .gitignore: ignore fetched po/ files

2014-11-27  A. Gordon

	datamash: new option '--narm': ignore NA/NaN values
	Suggested by Brandon Invergo:
	http://lists.gnu.org/archive/html/bug-datamash/2014-11/msg00002.html
	* src/utils.{h,c}: is_na() new function
	* src/field-ops.{c,h}: field_op_collect(): detect and skip NA/NaN
	values. field_op_summarize_empty(): print results of empty values.
	field_op_summarize(): call above function if no values.
	* src/datamash.c: update usage(), main()
	* doc/datamash.texi: mention new option
	* tests/datamash-tests-2.pl: add tests for new option.

2014-10-06  A. Gordon

	maint: update syntax-check rule to allow digits
	* cfg.mk: change regex in 'sc_prohibit_test_backticks' rule to allow
	digits in file names of tests. Also remove two unrelated file names
	(left-overs from 'coreutils')

2014-10-06  A. Gordon

	datamash: keep correct lines with '--full'
	Discussed in https://github.com/agordon/datamash/issues/3 ,
	and also existed in (very) old revisions of 'calc'.
	When using a selection operation (first/last/min/max/etc) which
	selects one line out of the group, and combining with --full, this
	ensures the correct line is selected (not just the correct fields
	intermixed with the values of the first line).
	* src/datamash.c: keep the correct line
	* src/field-ops.{c,h}: for selection operation, determine whether to
	keep the line or not.
	* src/datamash-tests.pl: adapt one test of 'lst' to new behavior.
	* src/datamash-tests-2.pl: tests new --full behavior

2014-10-06  A. Gordon

	maint: add 'static-analysis' make target
	Example:
	    make static-analysis
	or:
	    make static-analysis-configure
	    make static-analysis-make
	Uses clang's scan-build static analysis tool to analyze the code.
	Use the two-step method to change code and re-analyze without
	re-running './configure'.

2014-09-12  A. Gordon

	build: Enable build using Mingw cross-compilers
	* configure.ac: add flag and auto-detection for mingw
	* Makefile.am: add compilation flag
	* src/datamash.c: Use conditional printf type for size_t
	("%zu" for all unixes, "%Iu" for mingw)
	* HACKING.md: mention this issue
	* build-aux/check-remote-make-extra.sh: add mingw checks

	maint: add gnulib 'strsep' and 'random' modules
	These modules were missing when cross-compiling with mingw.
	* bootstrap.conf: add missing modules
	* m4/.gitignore: add files

2014-08-22  A. Gordon

	maint: expand HACKING.md file
	* HACKING.md: add notes.
	* cfg.mk: skip syntax-check on HACKING.md for few rules.

	datamash: improve delimiter edge-case handling
	* src/datamash.c: reject NUL delimiter, prevent shell-quoting problems
	when sorting with single-quote delimiter.
	* tests/datamash-tests.pl: add tests for these edge-cases.

	datamash: improve '--group' parameter parsing
	* src/strcnt.{c,h}: new utility function to count characters in
	string.
	* Makefile.am: add new module.
	* src/datamash.c: parse_group_spec(): improve parsing code.
	* tests/datamash-tests.pl: add more tests.

2014-08-12  Assaf Gordon

	datamash: new feature: accept named columns
	Examples:
	    datamash -H sum price < input.txt
	(If 'input.txt' has a column named 'price').
	    datamash -H -g id sum 2 < FILE
	(assuming FILE has column named 'id')
	Suggested by Shaun Jackman (sjackman@gmail.com).
	* bootstrap.conf: use gnulib's xstrndup module.
	* src/field-ops.{h,c}: if column parameter is not a valid number, keep
	parameter as string, and defer finding the number to later.
	* src/column-headers.{h,c}: add a function to get the column number by
	its name.
	* src/datamash.c: process_file(): if some columns are named, try to
	find the columns' number based on the header line, or exit with an
	error.
	parse_group_spec(): optionally use names.
	process_input_header(), remove_dups_in_file(): if grouping/operations
	use named columns, find column number after reading the header line.
	usage(): mention named columns.
	* tests/datamash-tests.pl: adapt existing tests to new error messages,
	add new tests for this feature.
	* man/datamash.x: add named-columns examples.

2014-08-12  Assaf Gordon

	maint: improve 'syntax-check'

2014-08-09  A. Gordon

	maint: new 'coverage-expensive' target
	* cfg.mk: add 'coverage-expensive' target, for better coverage.

	maint: improve build-check scripts
	* build-aux/check-remote-make-all.sh: accept -c/-b/-m/-e arguments and
	pass them to 'check-remote-make.sh' script.
	* build-aux/check-remote-make-extra.sh: check cross-compilation and
	other compilers
	* build-aux/check-remote-make-git.sh: check building from git
	repository.
	* build-aux/prerelease-checks.sh: use new script.

2014-08-09  Assaf Gordon

	tests: test more rmdup edge-cases

	datamash: fix header line edge-case
	* src/datamash.c: if the input contains a header line but no further
	lines, print the output header line correctly.
	* tests/datamash-tests.pl: add tests for this edge case.

2014-08-09  A. Gordon

	datamash: improve header handling in rmdup/reverse
	* src/datamash.c: handle input/output header combinations.
	* tests/datamash-tests.pl: add tests
	* tests/datamash-transpose.pl: add tests

	maint: add git-log-fix file
	* build-aux/git-log-fix: this file is used by gnulib's build module,
	but after gnulib upgrade it was removed. Add it back.

	datamash: improve debase64 error handling
	* src/field-ops.{c,h}: return error code on failed base64 decoding.
	* src/datamash.c: handle decoding errors.
	* tests/datamash-tests.pl: test failed base64 decoding.
	* tests/datamash-valgrind.sh: test base64 encoding/decoding on large
	input.

2014-08-09  Assaf Gordon

	maint: improve output of parallel-tests
	* Makefile.am, init.cfg: redirect stderr to file-descriptor 9,
	improving error reporting using automake's parallel-tests.
	Based on similar settings in GNU coreutils.

	build: add PURE attribute to functions

	build: add compiler warnings, -Werror

	build: speedup with gnulib's unlocked-io module
	* bootstrap.conf: add gnulib's unlocked-io module, providing
	significant speed improvement with readlinebuffer_delim().

2014-08-09  A. Gordon

	datamash: improve field-parsing code
	* src/text-lines.c: line_record_parse_fields(): revise code in case of
	single-character delimiter.

2014-08-09  A. Gordon

	datamash: new operations: rmdup, noop
	rmdup: remove lines based on duplicated keys. Similar to:
	    awk '!a[$1]++'
	noop: no-operation (optionally with --full: print file as-is).
	Used for testing and profiling.
	* bootstrap.conf: use gnulib's hash module.
	* src/field-ops.{c,h}: add new operations
	* src/utils.{h,c}: hash_compare_strings(): helper function for
	gnulib's hash module.
	* src/datamash.c: noop_file(), remove_dups_in_file(): functions for
	new operations; main(): call new functions. usage(): mention new
	operation.
	* tests/datamash-valgrind.sh: test 'rmdup' operation with large input.
	* man/datamash.x: mention new operation, add example.

2014-08-09  A. Gordon

	maint: improve syntax-check
	* src/field-ops.c: replace tabs with spaces.

	datamash: improve assert and code coverage
	Assert impossible conditions, and exclude them from code coverage.

	tests: test more edge-cases

	datamash: refactor operation output type
	* src/field-ops.{h,c} - define the output type (numeric/string) in the
	operations table, instead of during runtime.

	maint: fix/improve 'make coverage'
	* cfg.mk: force deletion of '*.gcno/*.gcda' when using
	'make coverage'. Otherwise, some failures occur with lcov reporting
	'unexpected end of file'.

	maint: fix syntax-check, part 3

	maint: fix syntax-check issues, part 2

	maint: fix source files for 'syntax-check', part 1

	maint: add few more syntax-check rules
	* cfg.mk: copy syntax-check rules from GNU coreutils.

	build: update gnulib submodule to latest

	build: disable --debug option
	* configure.ac: remove --enable-debug option.
	* src/*.c: remove #ifdef debug code.

2014-08-09  Assaf Gordon

	datamash: refactor field-splitting
	Instead of using gnulib's linebuffer directly, handle line-reading and
	field-splitting with a new structure 'line_record_t' .
	Field-splitting by delimiter is performed once, automatically, when a
	line is read. Pointers to each field are stored in 'line_record_t'.
	* src/text-lines.{c,h}: implement 'line_record_t' functionality.
	* src/datamash.c: replace 'linebuffer' with 'line_record_t' .
	* src/column-headers.{c,h}: use new functionality.
	* tests/datamash-tests.pl: adapt tests for new error messages.
	* tests/datamash-valgrind.sh: fix field delimiter in tests.

2014-08-08  Assaf Gordon

	maint: add 'default' case to prevent compiler warning

	datamash: bugfix: detect invalid column values
	* src/field-ops.c: add more checks
	* tests/datamash-tests.pl: add more tests

	tests: tests few more edge-cases

	datamash: improve field-op output handling
	* src/field-ops.{c,h}: pre-allocate buffer for string output when
	initializing the field-op, instead of malloc/free on every print.
	field_op_reserve_out_buf(): ensures the field-op's output buffer is
	large enough to hold the resulting string;
	field_op_to_hex(), unique_values(), count_unique_values(),
	collapse_values(), field_op_summarize(): use new output buffer.

2014-08-08  A. Gordon

	datamash: new line operations: md5/sha*, base64
	Line-Operations work on each line on the file (no grouping by key).
	Example: md5 on the first column of a file:
	    datamash md5 1 < FILE
	Is similar to:
	    perl -MDigest::MD5=md5_hex -lane 'print md5_hex($F[0])' < FILE
	* bootstrap.conf: add gnulib's md5/sha*/base64 modules
	* src/field-ops.{c,h}: implement new operations
	* src/datamash.c: process_file(): implement new 'line mode'
	processing; usage() mention new options.
	* man/datamash.x: mention new operations
	* doc/datamash.texi: ditto
	* tests/datamash-tests.pl: test new operations
	* tests/datamash-sha.pl: test sha1/256/512 operations (requiring
	recent perl module Digest::SHA, this test might be skipped on systems
	with old Perl).
	* Makefile.am: add new test file.

2014-08-03  A. Gordon

	maint: fix wrong path in previous commit

	maint: added bash-completion script
	* contrib/bash-completion/datamash - bash completion script
	* Makefile.am: include script in distribution (without installing it)
	* .gitignore: ignore the binary executable only

2014-07-30  A. Gordon

	maint: update NEWS

	maint: gitignore more files

	maint: add pre-release build-and-check script

	maint: improve check-and-build scripts
	* build-aux/check-remote-make.sh: report system information, accept
	custom git branch to check-out.
	* build-aux/check-remote-make-all.sh: add more hosts.

	tests: skip some tests if perl isn't found
	* configure.ac: test for Perl, setup PERL_FOUND automake variable.
	* Makefile.am: enable parallel tests for pl,sh files, and use a stub
	to skip tests if Perl is not found on the system.

	tests: mark valgrind test as 'expensive'

2014-07-27  A. Gordon

	tests: improve portability under qemu/binfmt
	When building with cross-compiling AND running (non-native) binary
	locally with qemu/binfmt, the argv[0] (binary name) is changed by
	qemu. argv[0] is used for error reporting.
	Example: on most systems, argv[0] is exactly what the user entered:
	    $ datamash --foobar
	    datamash: invalid option --foobar
	    $ /custom/path/datamash --foobar
	    /custom/path/datamash: invalid option --foobar
	But under qemu/binfmt, argv[0] becomes the full path:
	    $ datamash --foobar
	    /home/gordon/projects/datamash/datamash: invalid option --foobar
	The tests were modified to detect the actual path reported by the
	program, or fall back to 'datamash' (as it was before).

2014-07-27  A. Gordon

	maint: improve build-aux check scripts
	* renamed remote-make-check.sh => check-remote-make.sh
	* renamed remote-make-check-all.sh => check-remote-make-all.sh
	* check-remote-make.sh: added features:
	  1) more sources: local/remote tarball, git
	  2) more compressions: gz/bz2/xz
	  3) building from git with './bootstrap'
	  4) command-line options to set parameters to env/configure/make

2014-07-27  A. Gordon

	maint: improve debian-hardening build method
	* configure.ac: move option from './configure' to ..
	* cfg.mk: make rules.
	Instead of:
	    ./configure --enable-debian-hardening
	Now use:
	    make deb-hard

2014-07-27  A. Gordon

	man page: improve hyphen vs minus characters
	* man/datamash.x: escape minus characters where needed, to distinguish
	between hyphens and minus/dash.
	Suggested by Alejandro Garrido Mota

	tests: test I/O errors
	* tests/datamash-io-errors.sh: tests I/O errors using specially
	pre-mounted filesystems.
	* Makefile.am: add test.
	* build-aux/create_corrupted_filesystem.sh: script to create corrupted
	ext3 filesystem image.
	* build-aux/create_small_filesystem.sh: script to create small ext3
	filesystem which can be filled easily.

2014-07-25  A. Gordon

	datamash: add 'internal_error' macro
	* src/system.h: define 'internal_error' macro.
	* src/field-ops.c: Replace 'error' call with 'internal_error' macro.
	* src/datamash.c: ditto.

	build: don't re-generate manpage from tarball
	* configure.ac: search for '.git' directory, implying building from
	git (if found) or tarball (if not found).
	* Makefile.am: If building from GIT, re-generate manpage. If building
	from tarball, assume the manpage is already there, do not regenerate
	it, and do not clean it (with 'make clean').

2014-07-23  Assaf Gordon

	datamash: new operations: transpose, reverse
	* src/datamash.c: new functions: transpose_file(),
	reverse_fields_in_file().
	main(): parse new options
	usage(): update help screen
	* src/field-ops.{h,c}: new function: parse_operation_mode().
	parse_operations() renamed to parse_grouping_operations().
	* man/datamash.x: updated man-page information
	* doc/datamash.texi: updated texinfo manual
	* tests/datamash-transpose.pl - new tests
	* tests/datamash-valgrind.sh - new tests
	* Makefile.am: include new tests

2014-07-18  A. Gordon

	datamash: fix typo in help screen
	* src/datamash.c: usage(): fix 'pto' => 'dpo' typo.

2014-07-17  A. Gordon

	build: fix configure's message of examples path
	* configure.ac: Report correct path of installed examples
	( PACKAGE_NAME = "GNU Datamash" while PACKAGE = "datamash" ).

	build: fix configure.ac parameter typo
	* configure.ac: fix param definition (otherwise --disable-debug would
	fail).

	build: add build prerequisites
	* bootstrap.conf: add gettext, gperf, makeinfo as prerequisites when
	building from git source.

	tests: test headers with whitespace delimiters
	* tests/datamash-tests.pl: add test for header line with whitespace
	delimiters.

	datamash: simplify line buffer handling
	* src/text-lines.c: linebuffer_nullify(): based on
	readlinebuffer_delim(), there is always an EOL-delimiter present in
	the buffer.

	tests: add Sample-Skewness test
	* tests/datamash-stats.pl: test sample-skewness with just 2 data
	points.

	maint: improve internal errors
	* src/field-ops.c: don't gettextize internal errors (no need to
	translate them), and exclude them from coverage reports.
	* src/utils.c: exclude internal error from coverage reports.

	datamash: improve gettext strings
	* src/datamash.c: usage(): improve wording, gettextize missed string,
	line up long options.
	Based on suggestion by Benno Schulenberg from the Translation Project.

2014-07-16  A. Gordon

	maint: update NEWS and README

	maint: syntax-check fix
	* Makefile.am: reduce spaces to appease 'make syntax-check' rule.

	build: Explicitly use perl for help2man
	* Makefile.am: Use 'perl' executable in $PATH for help2man,
	don't count on #! path in help2man - Hydra/NixOS doesn't have
	"/usr/bin/env", so it can't be used.
	NOTE: Using customized help2man (see commit 33c75d), so don't use
	system's installed help2man.

2014-07-15  A. Gordon

	Expand manual page
	* man/datamash.x: expand manual page with content from old help
	screen.

	datamash: shorten help screen
	src/datamash.c: usage(): shorten help screen.

	build: man page depends on .x template
	* Makefile.am: make manual page generation depend on datamash.x
	template.

	build: accept single example in help2man
	* man/help2man: Modify the EXAMPLES pattern detection to accept a
	single example.

2014-07-15  A. Gordon

	build: don't hard-code Perl path in scripts
	* man/help2man - Use "/usr/bin/env perl" instead of "/usr/bin/perl".
	* tests/datamash-stats.pl - ditto
	* tests/datamash-tests.pl - ditto
	Hard-coded Perl fails Hydra/NixOS builds
	(http://hydra.nixos.org/build/12516876), and is also generally bad
	form.

2014-07-06  A. Gordon

	GNU Datamash: remove files relating to external sites

	GNU Datamash: update README

	GNU Datamash: rename package

2014-05-20  A. Gordon

	maint: improve comments in tests

2014-05-13  A. Gordon

	maint: add 'texlive' package to Travis-CI

	maint: add 'texinfo' package to travis-CI

2014-05-13  A. Gordon

	doc: add texinfo documentation

2014-04-30  A. Gordon

	maint: update copyright/license in files

2014-04-19  A. Gordon

	maint: update 'git-log-fix' due to gnulib upgrade.

	gnulib: upgrade to latest version

	maint: add more hosts to build-check script

	maint: .gitignore more files

	maint: fix 'make syntax-check' errors

	maint: replace old bcopy/bzero calls with memmove/memset

2014-04-15  A.
Gordon

	maint: add cygwin distribution packing script

	build: work-around strtold() bug on cygwin

	build: remove unneeded header from src/compute.c
	Compilation failed on Cygwin with this header.

	maint: improve automatic checks script

	Formalize rarely used debug option
	To build with debug option:
	    ./configure --enable-debug
	To use:
	    compute --debug [other options]

	build: fix compilation on Cygwin (missing nanl/expl)

2014-04-12  A. Gordon

	maint: improve instructions in 'tag-new-version' script

	maint: .gitignore few more files

	maint: add script to build and test on multiple hosts.

	maint: Improve output of 'aws-upload' script

	portability: cater to systems without a stable sort

	tests: improve portability when detecting 'nan'
	It's 'nan' on most systems, but 'NaN' on some.

	build: portability: add missing header
	bzero() and bcopy() fail to compile on DilOS without it.

	tests: improve portability of mktemp
	Five X's fail on OpenBSD.

	compute: default to TAB instead of Whitespace as delimiter

	compute: fix typos

2014-04-11  A. Gordon

	maint: add script to build & check on remote machine

	tests: improve portability of error message detection.

	tests: don't assume 'seq' is installed. SKIP if not found.

	build: improve 'inline' declaration for two functions.
	They failed to build on OpenBSD.

	compute: improve help screen

	compute: new operation: D'Agostino-Pearson omnibus (dpo) normality test

	src/utils: adjust math functions for consistent nan results

	compute: new operation: Jarque-Bera normality test

	tests: test mad/madraw with unsorted input

2014-04-10  A. Gordon

	tests: tests more sequences

	compute: new operation: excess kurtosis

	compute: new operation: skewness (for pop./sample)

	src/utils: extract function for arithmetic mean.

	tests: document equivalent R code for stat tests + reorder.

	compute: new operations: mad,madraw

	tests: add tests for pop/sample stdev/variance

	compute: new operations: q1,q3,iqr

	compute: new operation: rand

2014-04-09  A. Gordon

	build: add -Werror to ./src/ files.

	tests: fix comments

	compute: new operations: first,last

2014-04-08  A. Gordon

	compute: set numeric output precision

	build: improve comment in configure.ac

2014-04-08  A. Gordon

	build: Use compiler warnings from gl_WARN_ADD
	Those were wrongly dropped when changed to non-recursive makefile.
	The new flags are used only with the 'compute' sources, but not with
	the gnulib sources.

2014-04-08  A. Gordon

	compute: detect invalid suffix in numeric input.

	fields-ops: remove 'keep_lines' feature.
	Perhaps will be added in future versions.

	compute: move numeric conversion into 'field-ops' module.

2014-03-22  A. Gordon

	build: add helper scripts to EXTRA_DIST

	build: work-around for missing '.tarball-version'
	See: http://lists.gnu.org/archive/html/bug-hello/2014-03/msg00017.html

	maint: update helper scripts

	build: include 'help2man' and use it.
	Not all systems have "help2man" installed.

2014-03-20  A. Gordon

	maint: add helper script to upload files to AWS S3

	build: update helper scripts to new name

2014-03-19  A. Gordon

	RENAME: calc -> compute
	To avoid name conflicts with existing software.

2014-03-18  A. Gordon

	TravisCI - no need for GNU Sed 4.2.2 anymore.

	calc: bugfix on --sort without grouping.
	'pipe_through_sort' was not cleared, caused errors on FreeBSD
	(all other systems seem not to mind calling pclose on a non-popen
	FILE).

	calc: sort+headers: don't use GNU sed.
	Implement unbuffered input to avoid GNU sed dependency.

2014-03-15  A. Gordon

	calc: fix 'make syntax-check' errors

	build: improve ./configure messages

	gitignore: ignore more files
	Automatically generated by gnulib

2014-03-14  A. Gordon

	Travis-CI: add 'make distcheck', disable encode binary
	No need to save the static binary from a Linux build - can be easily
	created anywhere.

	build: make-bin script: improve tar filename

	build: update gnulib

	build: add empty git-log-fix
	Will be needed if there are ever commit typos/errors, see
	http://lists.gnu.org/archive/html/coreutils/2011-11/msg00004.html

	build: fix help2man issue on FreeBSD
	http://lists.gnu.org/archive/html/bug-hello/2014-03/msg00003.html

2014-03-13 A. Gordon

	Travis-CI: don't get git submodules.
	'bootstrap' should do it.

	Travis-CI: build with debian hardening flags

	src/calc: (temporarily) ignore the return value of fwrite
	using gnulib's 'closeout' module will check for write errors
	when the program terminates.

	build: add '--enable-debian-hardening' to './configure'

	report CPPFLAGS at the end of './configure'

	tests: minor change to improve portability
	Previous syntax failed on Mac-OS X.

	Switch to non-recursive Makefile

2014-03-11 A. Gordon

	Travis-CI: check coverage after successful build

	gitignore: ignore coverage files

	tests: test sort-pipe failure coverage

	tests: exclude unreachable line

	build: remove unused module 'strnumcmp'

	src/field-ops.c: document unreachable code.

	src/field-ops.c: refactor free() code.

	tests: add tests, improve coverage

	src/column-headers.c: improve error message

	build: add scripts to generate coverage information

	build: define inline'd symbols in src/text-options.c
	Avoids 'undefined reference' errors when compiling with
	coverage instrumentation.

2014-03-10 A. Gordon

	Travis: improve build, add scripts

2014-03-09 A. Gordon

	build: add 'make-bin' script to distribution

	build: print message at the end of ./configure

	build: make GZ tarballs, not XZ.

	build: add required README file as a stub.

	.gitignore: ignore more files

	build: helper script to build static binaries

	README: update information, refer to website

	.gitignore: ignore build-related files

	calc: free column headers, add tests

	calc: bugfix for headers without grouping

	tests: don't use valgrind on static binaries

2014-03-08 A. Gordon

	calc: new global case-sensitive option '-i' + tests

	build: fix minor errors

	refactor: extract common functions into modules

	maint: update copyright year, add GPL

	calc: conditionally compile '--debug' option.
	To enable it, use
	  ./configure CFLAGS="-DENABLE_DEBUG"

	calc: improve help screen.

	examples: improve 'scores' example

2014-03-07 A. Gordon

	build: remove 'make distcheck' from Travis-CI.
	It fails for unclear reason.

	calc: remove extraneous 'const'
	Caused compilation errors with clang.

	build: Add ".travis.yml" for Travis-CI

	calc: fix sort+headers, add tests

	examples: use '-s' instead of piping to 'sort|'

	calc: implement auto-sorting, add tests

	calc: add stub for '--sort' option.

	build: remove 'key-compare' from PO list

	calc: simplify code, don't use key-compare module.

	calc: simplify parameters, remove '--key' option.

	calc: add '-T' as shortcut for tab separator.

	Merge branch 'examples'

	examples: add example files.

2014-03-06 A. Gordon

	tests: fix valgrind test for 'make distcheck'

	tests: add valgrind tests

	calc: fix memory problems

2014-03-05 A. Gordon

	examples: new sub-directory

	tests: add countunique tests

	new operation: countunique

2014-03-04 A. Gordon

	tests: test single-line groups

2014-02-27 A. Gordon

	tests: check empty input

	bugfix: '--full' and no groups would return the wrong line.
	Returned the last line of the last group, instead of the first
	line of the last group (as it returns the first line of every
	other group).

	tests: add --header-in tests

	headers: don't print group headers with --full.

	tests: test count on non-numeric fields

	remove unused cruft (from header feature)

	bugfix: Allow 'count' to count non-numeric fields

	key-compare: make 'blanks' public.

	implement headers support.

	Implement output headers (without --full).

	Command-line processing and Help for 'headers' options.
2013-05-04 Assaf Gordon

	build: touch missing files

2013-04-27 Assaf Gordon

	build: use gl_WARN_ADD for compiler warnings

	key-compare: fix compilation warnings
	-Wunused-parameter and -Wswitch-default will trigger warnings.
	change code to avoid them.

2013-04-26 Assaf Gordon

	Merge branch 'gitchangelog'

	build: add gnulib's gitlog-to-changelog

2013-04-12 Assaf Gordon

	calc: improve help message for help2man

	gettext-ify the package

	autoconf: switch from gz to xz

	system.h: minor improvements

2013-04-10 Assaf Gordon

	calc: use gnulib closeout module
	Protect against write errors at the end of the program, e.g.
	'calc XXX > /dev/full' (which should fail).

	Merge branch 'better_autotools'

	GNU-ify: add auto-generated man page

	cleanups: remove 'scripts' directory
	There are no add-on scripts for calc (at the moment).

	GNU-ify: update src/Makefile.am

	GNU-ify: update tests/Makefile.am

	GNU-ify: add AUTHORS,THANKS

	GNU-ify based on GNU-hello, step 1 (configure.ac)

	code cleanups and more comments

	add GPLv3+ text to source files

2013-04-09 Assaf Gordon

	some more comments

	minor code cleanups

	minor code cleanups

	calc: grouping support (with proper output)

	calc: process groups.

	calc: add tests for groups

	calc: re-arrange headers

	calc: compare lines using --key

	calc: prepare code for key-compare.

	calc: import key-compare from coreutils.

2013-04-06 Assaf Gordon

	calc: expand '--help' section

	README: add simple use-case.

	tests: more tests (groups)

	tests: fix automake for out-of-tree builds

	calc: more tests

	calc: more tests

	calc: initial testing framework
	run with:
	  make check
	or:
	  make check VERBOSE=yes
	or:
	  make check VERBOSE=yes DEBUG=yes SAVE_TEMPS=yes

	calc: proper printing of output

	calc: field extraction works

2013-04-05 Assaf Gordon

	calc: re-factor, prepare for field support.

	calc: bugfix for numeric values with groups

	calc: initial grouping, by empty line

	calc: re-implement unique/collapse

	calc: implement unique/unique-nocase

	calc: implement mode/antimode

	calc: implement stdev/variance, other fixes.

2013-04-04 Assaf Gordon

	calc: first shot at multi-valued numeric (median)

	calc: single-value operations working.

	calc: add stub input reading

2013-04-03 Assaf Gordon

	calc: create field/op structures

	calc: parse operation arguments

	calc: set GNU coding styles for VIM

	calc: process -z and --debug

2013-03-30 Assaf Gordon

	build system: fix auto generated version.c

	calc: add template code
	Add version, copyright, help, usage messages.

	gnulib: add modules.

	configure.ac: remove extra GCC checks
	gnulib doesn't compile with them.
	TODO: add GCC warnings to project's build in src/Makefile.am.
	(but only if using GCC).

	added gnulib submodule

	Initial Commit