Bug 450724 – bsewavetool shell iteration capabilities

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 450724 - bsewavetool shell iteration capabilities


Summary:	bsewavetool shell iteration capabilities


Status:	RESOLVED FIXED

Product:	beast
Classification:	Other
Component:	tools
Version:	SVN trunk
Hardware:	Other Linux

Importance:	High enhancement
Target Milestone:	m0.7
Assigned To:	Beast Maintainers
QA Contact:	Beast Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2007-06-24 19:57 UTC by Stefan Westerfeld
Modified:	2007-10-13 19:15 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Proposed implementation for shell iteration (22.85 KB, patch) 2007-06-24 20:26 UTC, Stefan Westerfeld	needs-work	Details \| Review
New reworked patch for shell iteration (23.41 KB, patch) 2007-07-03 23:27 UTC, Stefan Westerfeld	none	Details \| Review
Reworked shell iteration patch with old xinfo semantic (17.92 KB, patch) 2007-07-04 20:33 UTC, Stefan Westerfeld	none	Details \| Review
New patch for shell iteration (17.89 KB, patch) 2007-07-06 16:25 UTC, Stefan Westerfeld	committed	Details \| Review
Unit test implementation for chunk keys (6.53 KB, patch) 2007-07-10 15:42 UTC, Stefan Westerfeld	none	Details \| Review
New unit test implementation for chunk keys (7.54 KB, patch) 2007-08-23 14:02 UTC, Stefan Westerfeld	committed	Details \| Review

Description Stefan Westerfeld 2007-06-24 19:57:31 UTC

The TODO says, it would be nice to be able to iterate over the chunks in a bsewave file with bsewavetool.

FROM TODO:

- loop chunks with shell script
--------------------------------
if user lists chunks:

   encode frequency to chunkkey with full float precision

if user matchs chunks (z.B. delete/xinfo):

   epsilon-comparision between chunk and key float frequency
--------------------------------

END TODO.

I'll suggest an implementation here.

Comment 1 Stefan Westerfeld 2007-06-24 20:26:33 UTC

Created attachment 90580 [details] [review]
Proposed implementation for shell iteration

Some remarks:
-------------

I choose to prefix the key with BWK0: ; this allows bsewavetool to detect whether the key is a valid key. Later, if the key format changes, we can use BWK1: BWK2: and so on to ensure that we don't decode garbage.

The key is machine independent, so in theory, you could use ssh or nfs in your scripts to combine systems with different endianness to perform a task (although I am not sure anyone will ever try this :).

I found an ugly problem in our error handling. A common strategy some bsewavetool commands use for deciding which chunks to (export|loop|clip) is to build a sorted frequency list, and then use

    for (list<WaveChunk>::iterator it = wave->chunks.begin();
         it != wave->chunks.end(); it++)
      if (all_chunks || wave->match (*it, freq_list))
        ...

to actually select the desired wave chunks. The problem with this strategy is that if a frequency is in the frequency list that is never matched against any chunk, bsewavetool will not report an error.

To fix this, I have put explicit loops before this main matching loop, to check manually that the frequency list does not contain any "bad" chunks. But if this is done anyway, then it might be more useful to keep the information which chunks shall be processed. My suggestion would be to introduce a selected() state into the WaveChunks. Then, first the chunks to process could be selected. Then, the main loop could be rewritten as:

    for (list<WaveChunk>::iterator it = wave->chunks.begin();
         it != wave->chunks.end(); it++)
      if (wave->selected())
        ...

Since match() cannot do error handling anyway, it should go away.

But this change could be done/discussed independent of this patch, as this patch at least works as expected (so this is more a refactoring issue).


Finally, I have rewritten and unified the chunk selection and the option parsing for -f -m --chunk-key (new) and --all-chunks. As a side effect, the xinfo chunk selection behaves differently than it used to. For instance:

bsewavetool xinfo foo.bsewave -f 330 volume=0.5 -f 440 volume=0.9

is no longer valid. Also

bsewavetool xinfo foo.bsewave -f 330 -f 440 volume=0.5

now affects BOTH chunks, instead of only the last chunk (440). This simplifies the code. For the user it makes all commands behave the same. Moreover, no functionality is lost, as the old behaviour can be achieved using multiple bsewavetool xinfo statements.

Comment 2 Tim Janik 2007-06-30 14:39:54 UTC

(In reply to comment #1)
> Created an attachment (id=90580) [edit]
> Proposed implementation for shell iteration

some comments about the patch:
- please avoid excessive newlines and please also avoid braces { one(); } around single statements.
- for output, please use g_print and g_printerr like the rest of the code, not printf.
- in general, we don't use all upper case vars, for OBFUSCATE, you could just as well use obfuscation_mask.
- if there is a possibility to have your editor automatically replace tabs with spaces, please turn it on. the tabs make your code hard to read in places.
- given that we can use Birnet types these days, you may write int/uint instead of gint/guint if you whish.
- the parsing of location vs. wave in XInfoCmd looks very complicated, it even introduces a new enum. please try to simplify the code, e.g. by keeping one bool that indicates wave seleciton and then catching the case where the wave is selected *and* freq_list is non empty.
- numeric operations on enums like (location <= BAD_LOCATION) should be abvoided for good style, enums and also their value mapping may change over time.
- i really don't like applying a patch that introduces 10 FIXMEs, rather than getting rid of them. that's a clear indication that the code needs reworking here.

>
 
> Some remarks:
> -------------
> 
> I choose to prefix the key with BWK0: ; this allows bsewavetool to detect

actually, your prefix is "BWK0:", and the ':' appears to be purely redundant.
also, this key opens up a new namespace, it's probably be better to use 'BSE'[0-9a-zA-Z]. that uses a known namespace, consumes one less char (the ':') and leaves room for 62 versions.


> whether the key is a valid key. Later, if the key format changes, we can use
> BWK1: BWK2: and so on to ensure that we don't decode garbage.
> 
> The key is machine independent, so in theory, you could use ssh or nfs in your
> scripts to combine systems with different endianness to perform a task
> (although I am not sure anyone will ever try this :).

it will, think e.g. of someone passing a bsewave on to someone else along with a script that modifies a subset of the chunks.

> I found an ugly problem in our error handling. A common strategy some
> bsewavetool commands use for deciding which chunks to (export|loop|clip) is to
> build a sorted frequency list, and then use
> 
>     for (list<WaveChunk>::iterator it = wave->chunks.begin();
>          it != wave->chunks.end(); it++)
>       if (all_chunks || wave->match (*it, freq_list))
>         ...
> 
> to actually select the desired wave chunks. The problem with this strategy is
> that if a frequency is in the frequency list that is never matched against any
> chunk, bsewavetool will not report an error.

similarly to how you call parse_chunk_selection() during argument parsing now, couldn't you simply call a functoin verify_chunk_selection() after the parsing that verifies the chunk selection matching the existing chunk list?

> To fix this, I have put explicit loops before this main matching loop, to check
> manually that the frequency list does not contain any "bad" chunks. But if this
> is done anyway, then it might be more useful to keep the information which
> chunks shall be processed. My suggestion would be to introduce a selected()
> state into the WaveChunks. Then, first the chunks to process could be selected.
> Then, the main loop could be rewritten as:
> 
>     for (list<WaveChunk>::iterator it = wave->chunks.begin();
>          it != wave->chunks.end(); it++)
>       if (wave->selected())
>         ...

i guess that is not needed if you simply have a function to verify the specified seleciton. unless there are other compelling reasons to not continue to maintain a seperate selection object.

> Finally, I have rewritten and unified the chunk selection and the option
> parsing for -f -m --chunk-key (new) and --all-chunks. As a side effect, the
> xinfo chunk selection behaves differently than it used to. For instance:
> 
> bsewavetool xinfo foo.bsewave -f 330 volume=0.5 -f 440 volume=0.9
> 
> is no longer valid.

well, the code was initially written to specifically support assigning different xinfos to different chunks in one pass. the main reason here is that calling bsewavetool repeatedly to achieve the same effect would be *much* slower, since multiple megabytes may need to be rewritten each time.

Comment 3 Tim Janik 2007-06-30 15:39:52 UTC

(In reply to comment #1)
> I choose to prefix the key with BWK0: ; this allows bsewavetool to detect
> whether the key is a valid key. Later, if the key format changes, we can use
> BWK1: BWK2: and so on to ensure that we don't decode garbage.
> 
> The key is machine independent, so in theory, you could use ssh or nfs in your
> scripts to combine systems with different endianness to perform a task
> (although I am not sure anyone will ever try this :).

i've rethought the key issue some, and found:
- having a few bits (ca. 4) forintegrity checking would be nice in case someone mistypes a key.
- versioning is rather uninteresting for us, in the context of bsewavetool, we'll either use a float as wave chunk primary sort key, or in the future use *additional* fields as primary sort key. in that case however we don't need versioning for the key string, because it's length changes anyway.
- i don't really see the need for namespacing/prefixing if i look at similar concepts elsewhere, e.g. git commit ids or file hashes. interestingly, git does associate a tpye with hashes, however that can't be told from the hash but also needs to be extracted:
  $ git-cat-file -t 3705e37a7fdd8f1c87e4b8b01ae6500351005a81
  commit
- making the key shorter could be in the interest of the user if they need to be retyped or pasted.

as a result, it'd probably be better to store key strings like this:
- use a conscise format to map from 32+4bit (our current key string width plus integrity bits) to ascii characters, e.g. [a-zA-Z0-9]+
- use a bit-position dependent hash algorithm (i.e. not just XOR) to create the integrity bits, e.g. printf %d + g_str_hash() % nbits might be good enough here.

Comment 4 Stefan Westerfeld 2007-07-03 23:27:33 UTC

Created attachment 91147 [details] [review]
New reworked patch for shell iteration

Brief list of changes:
- improve whitespace (expand tabs setting in vim), remove some newlines/braces
- replace printf by g_print
- make the keys work the way you suggested, they look nicely randomized now and have a 9bit checksum, while being only 7 [0-9][a-z][A-Z] bytes long; here is a sample output
stefan@lotrien:/usr/local/src/beast/tools$ bsewavetool list-chunks ../r.bsewave
LOAD: ../r.bsewave
EXEC: list-chunks
Kg6jgDS
lQMsUt2
IvJ08KR
QyfXkkB
Gr585dW
cSLKpSB
SCdY7ja
SAVE: ../r.bsewave
- get rid of the FIXMEs
- rewrite my clumsy xinfo args parser
- use uint (it is currently not possible to use uint64 in bsewavetool.cc, try yourself and you'll see that gcc complains about conflicting declarations for uint64)

When I got everything right the only remaining issue is whether the new xinfo code is too slow in real world scenarios. My approach would be: apply the patch and wait if some day someone complains, and then optimize or put the old, more complex code back.

Its not sure that being able to set many xinfos at once buys you anything, because there are many other operations that are still carried out sequentially, so if you really want high speed then opening the wave once, performining all operations, and then closing the wave (like the gui would do) is the optimal thing to do anyway. This would be possible with an idlified bsewavetool and a python or scheme binding.

Comment 5 Stefan Westerfeld 2007-07-04 20:33:35 UTC

Created attachment 91208 [details] [review]
Reworked shell iteration patch with old xinfo semantic

After our face2face discussion on xinfo efficiency, here is a new patch with the old xinfo semantic. Other than that, nothing changed as compared to the previous patch.

Comment 6 Tim Janik 2007-07-06 12:27:23 UTC

Comment on attachment 91208 [details] [review]
Reworked shell iteration patch with old xinfo semantic

>--- tools/bsewavetool.cc	(revision 4351)
>+++ tools/bsewavetool.cc	(working copy)
>+class WaveChunkKey {
>+  BseFloatIEEE754 m_osc_freq;
>+public:
>+  WaveChunkKey (const String& key_string)
>+  {
>+    // mark key as invalid
>+    m_osc_freq.v_float = -1;     
>+    if (key_string.size() != 7)
>+      return;  // invalid key
>+
>+    guint64 key_uint = 0;
>+    for (string::const_reverse_iterator si = key_string.rbegin(); si != key_string.rend(); si++)
>+      {
>+        key_uint *= 62;
>+        if (*si >= '0' && *si <= '9')
>+          key_uint += *si - '0';
>+        else if (*si >= 'A' && *si <= 'Z')
>+          key_uint += *si - 'A' + 10;
>+        else if (*si >= 'a' && *si <= 'z')
>+          key_uint += *si - 'a' + 10 + 26;
>+        else
>+          return; // invalid key
>+      }
>+
>+    const guint64 key_checksum = key_uint % 512;

this is effectively & 0x01ff (better be written like that, you'll see in a second)

>+    key_uint ^= key_checksum << 32LL;  // deobfuscate high bits with checksum
>+    const guint64 checksum = g_str_hash (string_printf ("%lld", key_uint - key_checksum).c_str()) % 512;

hm, using modulo 512 here aliases most of the upper hash bits. i think it'd be better deviding b ythe next prime, to preserve more of the entropy, i.e. use % 509 here.

>+    if (key_checksum != checksum)
>+      return; // invalid key
>+    key_uint >>= 9;
>+
>+    // decode float components in byte order independent way
>+    m_osc_freq.mpn.mantissa = key_uint;
>+    key_uint >>= 23;
>+    m_osc_freq.mpn.biased_exponent = key_uint;
>+    key_uint >>= 8;
>+    m_osc_freq.mpn.sign = key_uint;
>+  }
>+  WaveChunkKey (float osc_freq)
>+  {
>+    m_osc_freq.v_float = osc_freq;
>+  }
>+  String
>+  as_string() const
>+  {
>+    guint64 key_uint = 0;
>+
>+    // put float components to key in byte order independent way
>+    key_uint |= m_osc_freq.mpn.sign;            //  1 bit
>+    key_uint <<= 8;
>+    key_uint |= m_osc_freq.mpn.biased_exponent; //  8 bit
>+    key_uint <<= 23;
>+    key_uint |= m_osc_freq.mpn.mantissa;        // 23 bit
>+    key_uint <<= 9;                             // +9 bit checksum  
>+    const guint64 checksum = g_str_hash (string_printf ("%lld", key_uint).c_str()) % 512;

this will need to be % 509 as well.

>+    key_uint |= checksum;
>+    key_uint ^= (key_uint % 512) << 32LL;       // obfuscate high bits with checksum

and this should then be & 0x01ff;

>+
>+    string key_string;
>+    for (int i = 0; i < 7; i++) /* encode in custom base-62 format; 7 digits  <=>  2^41 < 62^7 */
>+      {
>+        guint64 digit = key_uint % 62;
>+        if (digit < 10)
>+          key_string += '0' + digit;
>+        else if (digit < (10 + 26))
>+          key_string += 'A' + digit - 10;
>+        else if (digit < (10 + 26 + 26))
>+          key_string += 'a' + digit - 10 - 26;
>+        else
>+          g_assert_not_reached();

i think this g_assert_not_reached(); can be removed. there *really* would have to be something wrong with the above %62 ;)

>+        key_uint /= 62;
>+      }
>+    g_assert (key_uint == 0);

hm, i'd have this left out as well...

>+      
>+    return key_string;
>+  }
>+  bool
>+  is_valid() const
>+  {
>+    return m_osc_freq.v_float >= 0;
>+  }
>+  float
>+  osc_freq() const
>+  {
>+    return m_osc_freq.v_float;
>+  }
>+};

hm, the above code looks good, but should really be under test, unfortunately we don't have a test facility for bsewavetool yet, and making writing a standard test for this is hard without factoring out key generation (which i think we don't have much use for at the moment). so i'd suggest to do this for now:
- add a "unit-test" command that can call a WaveChunkKey.unit_test() function
- WaveChunkKey.unit_test() should attempt to convert back/forth between most of the valid float range and strings
- call bsewavetool unit-test from Makefile.am:check:
- the unti-test command can be added after the funcitonality went into SVN, i.e. we don't need to block an even-larger-to-become patch on it.

>+static bool
>+parse_chunk_selection (char          **argv,
>+                       uint           &i,
>+                       uint            argc,
>+                       bool           &all_chunks,
>+                       vector<float>  &freq_list)
>+
>+{
>+  const gchar *str = NULL;
>+
>+  if (parse_bool_option (argv, i, "--all-chunks"))
>+    {
>+      all_chunks = true;
>+      return true;
>+    }
>+  else if (parse_str_option (argv, i, "-f", &str, argc))
>+    {
>+      freq_list.push_back (g_ascii_strtod (str, NULL));
>+      return true;
>+    }
>+  else if (parse_str_option (argv, i, "--chunk-key", &str, argc))
>+    {
>+      WaveChunkKey key (str);
>+      if (key.is_valid())
>+        {
>+          freq_list.push_back (key.osc_freq());
>+          return true;
>+        }
>+      else
>+        {
>+          sfi_error ("%s is not a valid bsewavetool key (keys need to be generated by list-chunks)", str);

if possible, errors should be formulated to show variable components (e.g. user input) at the end after a colon, i.e. here we can write more conscisely:
     sfi_error ("invalid bsewavetool chunk key: %s", str);
that keys need to be generated by bsewavetool can be easily inferred from this (if you had to keep "list-chunks" literally, you could use "invalid bsewavetool (list-chunks) chunk key: %s")


>+          exit (1);
>+        }
>+    }
>+  else if (parse_str_option (argv, i, "-m", &str, argc))
>+    {
>+      SfiNum num = g_ascii_strtoull (str, NULL, 10);
>+      gfloat osc_freq = 440.0 /* MIDI standard pitch */ * pow (BSE_2_POW_1_DIV_12, num - 69. /* MIDI kammer note */);
>+      freq_list.push_back (osc_freq);
>+      return true;
>+    }
>+  return false;
>+}
>+
>+static bool
>+verify_chunk_selection (const vector<float> &freq_list,
>+                        Wave                *wave)
>+{
>+  for (vector<float>::const_iterator fi = freq_list.begin(); fi != freq_list.end(); fi++)
>+    {
>+      if (!wave->lookup (*fi))
>+        {
>+          sfi_error ("wave chunk with oscillator frequency %.2f does not exist", *fi);

similar error format here:
     sfi_error ("failed to find wave chunk with oscillator frequency: %.2f", *fi);

>+          if (!continue_on_error)
>+            return false;
>+        }
>+    }
>+  return true;
>+}
>+
> static void
> wavetool_parse_args (int    *argc_p,
>                      char ***argv_p)
>@@ -964,7 +1113,7 @@ public:
>   void
>   blurb (bool bshort)
>   {
>-    g_print ("{-m=midi-note|-f=osc-freq|--all-chunks|--wave} key=[value] ...\n");
>+    g_print ("{-m=midi-note|-f=osc-freq|--chunk-key=key|--all-chunks|--wave} key=[value] ...\n");
>     if (bshort)
>       return;
>     g_print ("    Add, change or remove an XInfo string of a bsewave file.\n");
>@@ -975,6 +1124,7 @@ public:
>     g_print ("    Options:\n");
>     g_print ("    -f <osc-freq>       oscillator frequency to select a wave chunk\n");
>     g_print ("    -m <midi-note>      alternative way to specify oscillator frequency\n");
>+    g_print ("    --chunk-key <key>   select wave chunk using chunk key from list-chunks\n");
>     g_print ("    --all-chunks        apply XInfo modification to all chunks\n");
>     g_print ("    --wave              apply XInfo modifications to the wave itself\n");
>     /*       "**********1*********2*********3*********4*********5*********6*********7*********" */
>@@ -1050,6 +1200,33 @@ public:
>                 location = OSC_FREQ;
>               }
>           }
>+        else if (strcmp ("--chunk-key", arg) == 0 ||
>+                 strncmp ("--chunk-key", arg, 11) == 0)
>+          {
>+            const gchar *equal = arg + 11;
>+            const gchar *key_str = NULL;
>+            if (*equal == '=')              /* --chunk-key=Arg */
>+              key_str = equal + 1;
>+            else if (it + 1 != args.end())  /* --chunk-key Arg */
>+              {
>+                it++;
>+                key_str = *it;
>+              }
>+            if (key_str)
>+              {
>+                WaveChunkKey key (key_str);
>+                if (key.is_valid())
>+                  {
>+                    osc_freq = key.osc_freq();
>+                    location = OSC_FREQ;
>+                  }
>+                else
>+                  {
>+                    sfi_error ("%s is not a valid bsewavetool key (keys need to be generated by list-chunks)", key_str);

see above error msg  comments.

>+                    exit (1);
>+                  }
>+              }
>+          }
>         else /* XInfo string */
>           {
>             const gchar *equal = strchr (arg, '=');
>@@ -1114,7 +1291,7 @@ public:
>   void
>   blurb (bool bshort)
>   {
>-    g_print ("{-m=midi-note|-f=osc-freq|--all-chunks} [options]\n");
>+    g_print ("{-m=midi-note|-f=osc-freq|--chunk-key=key|--all-chunks} [options]\n");
>     if (bshort)
>       return;
>     g_print ("    Clip head and or tail of a wave chunk and produce fade-in ramps at the\n");
>@@ -1123,6 +1300,7 @@ public:
>     g_print ("    Options:\n");
>     g_print ("    -f <osc-freq>       oscillator frequency to select a wave chunk\n");
>     g_print ("    -m <midi-note>      alternative way to specify oscillator frequency\n");
>+    g_print ("    --chunk-key <key>   select wave chunk using chunk key from list-chunks\n");
>     g_print ("    --all-chunks        try to clip all chunks\n");
> #if 0
>     g_print ("    --config=\"s h t f p r\"\n");
>@@ -1152,23 +1330,8 @@ public:
>     for (guint i = 1; i < argc; i++)
>       {
>         const gchar *str = NULL;
>-        if (parse_bool_option (argv, i, "--all-chunks"))
>-          {
>-            all_chunks = true;
>-            seen_selection = true;
>-          }
>-        else if (parse_str_option (argv, i, "-f", &str, argc))
>-          {
>-            freq_list.push_back (g_ascii_strtod (str, NULL));
>-            seen_selection = true;
>-          }
>-        else if (parse_str_option (argv, i, "-m", &str, argc))
>-          {
>-            SfiNum num = g_ascii_strtoull (str, NULL, 10);
>-            gfloat osc_freq = 440.0 /* MIDI standard pitch */ * pow (BSE_2_POW_1_DIV_12, num - 69. /* MIDI kammer note */);
>-            freq_list.push_back (osc_freq);
>-            seen_selection = true;
>-          }
>+	if (parse_chunk_selection (argv, i, argc, all_chunks, freq_list))
>+          seen_selection = true;
>         else if (parse_str_option (argv, i, "-s", &str, argc))
>           threshold = g_ascii_strtod (str, NULL);
>         else if (parse_str_option (argv, i, "-h", &str, argc))
>@@ -1188,6 +1351,9 @@ public:
>   exec (Wave *wave)
>   {
>     sort (freq_list.begin(), freq_list.end());
>+    if (!verify_chunk_selection (freq_list, wave))
>+      exit (1);
>+
>     vector<list<WaveChunk>::iterator> deleted;
>     /* level clipping */
>     for (list<WaveChunk>::iterator it = wave->chunks.begin(); it != wave->chunks.end(); it++)
>@@ -1258,7 +1424,7 @@ public:
>   void
>   blurb (bool bshort)
>   {
>-    g_print ("{-m=midi-note|-f=osc-freq|--all-chunks} [options]\n");
>+    g_print ("{-m=midi-note|-f=osc-freq|--chunk-key=key|--all-chunks} [options]\n");
>     if (bshort)
>       return;
>     g_print ("    Normalize wave chunk. This is used to extend (or compress) the signal\n");
>@@ -1266,6 +1432,7 @@ public:
>     g_print ("    Options:\n");
>     g_print ("    -f <osc-freq>       oscillator frequency to select a wave chunk\n");
>     g_print ("    -m <midi-note>      alternative way to specify oscillator frequency\n");
>+    g_print ("    --chunk-key <key>   select wave chunk using chunk key from list-chunks\n");
>     g_print ("    --all-chunks        try to normalize all chunks\n");
>     /*       "**********1*********2*********3*********4*********5*********6*********7*********" */
>   }
>@@ -1276,24 +1443,8 @@ public:
>     bool seen_selection = false;
>     for (guint i = 1; i < argc; i++)
>       {
>-        const gchar *str = NULL;
>-        if (parse_bool_option (argv, i, "--all-chunks"))
>-          {
>-            all_chunks = true;
>-            seen_selection = true;
>-          }
>-        else if (parse_str_option (argv, i, "-f", &str, argc))
>-          {
>-            freq_list.push_back (g_ascii_strtod (str, NULL));
>-            seen_selection = true;
>-          }
>-        else if (parse_str_option (argv, i, "-m", &str, argc))
>-          {
>-            SfiNum num = g_ascii_strtoull (str, NULL, 10);
>-            gfloat osc_freq = 440.0 /* MIDI standard pitch */ * pow (BSE_2_POW_1_DIV_12, num - 69. /* MIDI kammer note */);
>-            freq_list.push_back (osc_freq);
>-            seen_selection = true;
>-          }
>+	if (parse_chunk_selection (argv, i, argc, all_chunks, freq_list))
>+	  seen_selection = true;
>       }
>     return !seen_selection ? 1 : 0; /* # args missing */
>   }
>@@ -1301,6 +1452,9 @@ public:
>   exec (Wave *wave)
>   {
>     sort (freq_list.begin(), freq_list.end());
>+    if (!verify_chunk_selection (freq_list, wave))
>+      exit (1);
>+

the combination of sfi_error, continue_on_error and exit()s on !verify_chunk_selection() doesn't make too much sense to me. note:
- if continue_on_error==TRUE, ignored condisiotions should be signalled with sfi_warning and not sfi_error (the current code might even need fixing in this regard)
- the exit() should always be next to the sfi_error() corresponding to it (works, if you only use sfi_warning otherwise).
- i think that way you can simplify verify_chunk_selection() specifically, by:
  * not return a value
  * use sfi_warning for continue on error
  * use sfi_error and exit() immediately otherwise

>     /* normalization */
>     for (list<WaveChunk>::iterator it = wave->chunks.begin(); it != wave->chunks.end(); it++)
>       if (all_chunks || wave->match (*it, freq_list))


>@@ -2259,6 +2392,50 @@ public:
>   }
> } cmd_export ("export");
> 
>+class ListChunks : public Command {
>+public:
>+  ListChunks (const char *command_name) :
>+    Command (command_name)
>+  {
>+  }
>+  void
>+  blurb (bool bshort)
>+  {
>+    g_print ("[options]\n");
>+    if (bshort)
>+      return;
>+    g_print ("    Prints a list of chunk keys of the chunks contained in the bsewave file.\n");
>+    g_print ("    A chunk key for a given chunk identifies the chunk uniquely and stays valid\n");
>+    g_print ("    if other chunks are inserted and deleted.\n");
>+    g_print ("    Here is a bash script to export all chunks (like export with --all-chunks):\n");

the following script should be indented by at least 2 spaces.

>+    g_print ("    for key in `bsewavetool list-chunks foo.bsewave`\n");
>+    g_print ("    do\n");

this is invalid shell syntax, you need this at the end of the for line " ; \\\n"
or better yet:
  g_print ("    for key in `bsewavetool list-chunks foo.bsewave` ; do\n");

>+    g_print ("      bsewavetool export foo.bsewave --chunk-key $key -x /tmp/foo-note-%%N.wav\n");

this should be something like "info" or so instead. so the user can paste the snippet and try it out immediately without significant ill effects (e.g. deleting chunks, or flooding his directory)

>+    g_print ("    done\n");
>+    /*       "**********1*********2*********3*********4*********5*********6*********7*********" */
>+  }
>+  guint
>+  parse_args (guint  argc,
>+              char **argv)
>+  {
>+    return 0; // no missing args
>+  }
>+  void
>+  exec (Wave *wave)
>+  {
>+    /* get the wave into storage order */
>+    wave->sort();
>+    for (list<WaveChunk>::iterator it = wave->chunks.begin(); it != wave->chunks.end(); it++)
>+      {
>+        WaveChunk     *chunk = &*it;
>+	WaveChunkKey   chunk_key (gsl_data_handle_osc_freq (chunk->dhandle));
>+
>+	g_print ("%s\n", chunk_key.as_string().c_str());
>+      }
>+  }
>+} cmd_list_chunks ("list-chunks");
>+
>+

thanks, the patch looks mostly good now ;)

Comment 7 Stefan Westerfeld 2007-07-06 16:21:43 UTC

(In reply to comment #6)
> (From update of attachment 91208 [details] [review] [edit])
> >+class ListChunks : public Command {
> >+public:
> >+  ListChunks (const char *command_name) :
> >+    Command (command_name)
> >+  {
> >+  }
> >+  void
> >+  blurb (bool bshort)
> >+  {
> >+    g_print ("[options]\n");
> >+    if (bshort)
> >+      return;
> >+    g_print ("    Prints a list of chunk keys of the chunks contained in the bsewave file.\n");
> >+    g_print ("    A chunk key for a given chunk identifies the chunk uniquely and stays valid\n");
> >+    g_print ("    if other chunks are inserted and deleted.\n");
> >+    g_print ("    Here is a bash script to export all chunks (like export with --all-chunks):\n");
> 
> the following script should be indented by at least 2 spaces.
> 
> >+    g_print ("    for key in `bsewavetool list-chunks foo.bsewave`\n");
> >+    g_print ("    do\n");
> 
> this is invalid shell syntax, you need this at the end of the for line " ;
> \\\n"

This is not true. I tested the script with bash, and after your remark I used dash as well, and it works fine. What is true is: for, do, the statements within the block and done need to be individual shell statements. But you can choose whether you separate them by ; or newlines.

The only context where you need to write

for i in a b c ; \
do ...

is within makefiles. But this is not because of the shell, but because within a Makefile, you need to put the instructions in one long "virtual" line, glued by backslashes.

> or better yet:
>   g_print ("    for key in `bsewavetool list-chunks foo.bsewave` ; do\n");
> 
> >+    g_print ("      bsewavetool export foo.bsewave --chunk-key $key -x /tmp/foo-note-%%N.wav\n");
> 
> this should be something like "info" or so instead. so the user can paste the
> snippet and try it out immediately without significant ill effects (e.g.
> deleting chunks, or flooding his directory)

Ok, I put a FIXME there. I think its bad style to apply a patch which uses an example which is not available because another patch has not been applied yet.

Comment 8 Stefan Westerfeld 2007-07-06 16:25:51 UTC

Created attachment 91312 [details] [review]
New patch for shell iteration

Chunk validation has been rewritten.
The checksum is % 509 now for better entropy.
Messages have been rewritten to show important info at the end.
A FIXME was added to indicate info should be used as example once available.

Comment 9 Tim Janik 2007-07-09 23:36:47 UTC

(In reply to comment #7)
> (In reply to comment #6)
> > >+    g_print ("    for key in `bsewavetool list-chunks foo.bsewave`\n");
> > >+    g_print ("    do\n");
> > 
> > this is invalid shell syntax, you need this at the end of the for line " ;
> > \\\n"
> 
> This is not true. I tested the script with bash, and after your remark I used
> dash as well, and it works fine.

hum, i only made that comment after a test that actually failed for me (with bash) but can't reproduce that now...

> The only context where you need to write
> 
> for i in a b c ; \
> do ...
> 
> is within makefiles.
[...]

we're not talking about makefiles here.

(In reply to comment #8)
> Created an attachment (id=91312) [edit]
> New patch for shell iteration

ok, looks good now, except for missing test and the FIXME.
one thing i'd have changed to make the code clearer though is to toggle the order of the <<=9 and str_hash in both cases, so the hashed string is the actuall key, not the key*512.

Comment 10 Stefan Westerfeld 2007-07-10 15:42:07 UTC

Created attachment 91551 [details] [review]
Unit test implementation for chunk keys

As suggested above, I implemented unit testing. I didn't do it as command, but added a bsewavetool option, as commands always need a .bsewave file. I didn't document the option, as it is more or less something internal we need for our Makefiles, and it could be changed at any point in time.

Comment 11 Tim Janik 2007-08-22 12:24:27 UTC

Comment on attachment 91551 [details] [review]
Unit test implementation for chunk keys

>Index: tools/bsewavetool.cc
>===================================================================
>--- tools/bsewavetool.cc	(revision 4353)
>+++ tools/bsewavetool.cc	(working copy)
>@@ -23,6 +23,7 @@
> #include <bse/gslvorbis-enc.h>
> #include <bse/gsldatahandle-vorbis.h>
> #include <bse/bseresamplerimpl.hh>
>+#include <birnet/birnettests.h>
> #include <stdlib.h>
> #include <errno.h>
> #include <string.h>
>@@ -395,6 +396,80 @@ public:
>   {
>     return m_osc_freq.v_float;
>   }
>+  static void
>+  unit_test()
>+  {
>+    struct T {
>+      const char *key;
>+      double      value;
>+    } decode_test[] = { 
>+      { "YJv6nCS", 1.000000000e-03 }, { "gcQWdrQ", 1.256637061e-03 }, { "dXeM2zG", 1.579136704e-03 }, 
>+      { "3BSjFXD", 1.984401708e-03 }, { "z2ncv68", 2.493672730e-03 }, { "3FgEbrN", 3.133641572e-03 }, 
>+      { "3udggXH", 3.937850137e-03 }, { "mwJOdP8", 4.948448424e-03 }, { "Wq8Df29", 6.218403687e-03 }, 
>+      { "1PCvWKP", 7.814276536e-03 }, { "6prz6TI", 9.819709503e-03 }, { "uAVbweQ", 1.233981089e-02 }, 
>+      { "WiAHl0H", 1.550666370e-02 }, { "QTga6LU", 1.948624831e-02 }, { "R5IamYE", 2.448714181e-02 }, 
>+      { "X8Hq3EV", 3.077144993e-02 }, { "bTysq86", 3.866854441e-02 }, { "O44r1F6", 4.859232602e-02 }, 
>+      { "6Xtdz9H", 6.106291778e-02 }, { "oTnEADX", 7.673392556e-02 }, { "lYHPo92", 9.642669472e-02 }, 
>+      { "JIN6b3N", 1.211733583e-01 }, { "YHQ64CY", 1.522709329e-01 }, { "IYvE8J2", 1.913492977e-01 }, 
>+      { "H3PGUBB", 2.404566191e-01 }, { "RPKNVNW", 3.021666993e-01 }, { "oKEsrTE", 3.797138730e-01 }, 
>+      { "Ib6py6X", 4.771625256e-01 }, { "vUBdWbW", 5.996201140e-01 }, { "NeXqt5A", 7.535048580e-01 }, 
>+      { "onjLlCN", 9.468821305e-01 }, { "KqKpl8b", 1.189887178e+00 }, { "sFUvRra", 1.495256327e+00 }, 
>+      { "iDMmswb", 1.878994517e+00 }, { "Qi98zf1", 2.361214148e+00 }, { "fwGfS3U", 2.967189208e+00 }, 
>+      { "3QiAhhc", 3.728679927e+00 }, { "KjOWeVX", 4.685597387e+00 }, { "qd1u3XX", 5.888095331e+00 }, 
>+      { "Sjny6ta", 7.399198815e+00 }, { "Y7H8x3b", 9.298107455e+00 }, { "OPQUQ92", 1.168434643e+01 }, 
>+      { "WDDAiXY", 1.468298276e+01 }, { "NW2gCXR", 1.845118031e+01 }, { "Z3jckdV", 2.318643701e+01 }, 
>+      { "vHT1zmF", 2.913693606e+01 }, { "B1nbexX", 3.661455372e+01 }, { "Rm9vGPI", 4.601120519e+01 }, 
>+      { "NRKoXwb", 5.781938568e+01 }, { "n3ckrYC", 7.265798291e+01 }, { "hePAxF9", 9.130471414e+01 }, 
>+      { "1tMGZWQ", 1.147368877e+02 }, { "DEDCuGT", 1.441826254e+02 }, { "WhpTjIX", 1.811852306e+02 }, 
>+      { "7Qfdr93", 2.276840758e+02 }, { "sDV9XtJ", 2.861162480e+02 }, { "9BhSRT5", 3.595442811e+02 }, 
>+      { "wVwxSBb", 4.518166688e+02 }, { "ZIun3lR", 5.677695710e+02 }, { "kfGMuca", 7.134802853e+02 }, 
>+      { "xm31dSY", 8.965857691e+02 }, { "3Jw8sqH", 1.126682906e+03 }, { "KV35vV7", 1.415831496e+03 }, 
>+      { "hgCCJWQ", 1.779186331e+03 }, { "EbldxII", 2.235791483e+03 }, { "I29fmr4", 2.809578439e+03 }, 
>+      { "kso3Dxa", 3.530620394e+03 }, { "v7xZPlV", 4.436708436e+03 }, { "rJErUN6", 5.575332252e+03 }, 
>+      { "5IleXBA", 7.006169138e+03 }, { "N9TbUyN", 8.804211797e+03 }, { "oChbIqW", 1.106369884e+04 }, 
>+      { "wezNP3V", 1.390305400e+04 }, { "RisqTI5", 1.747109292e+04 }, { "QuypC2Z", 2.195482287e+04 }, 
>+      { "7xalGea", 2.758924410e+04 }, { "XmOEsnX", 3.466966663e+04 }, { "hiKocjG", 4.356718800e+04 }, 
>+      { "70aj5NU", 5.474814310e+04 }, { "sOSufiT", 6.879854566e+04 }, { "xBZnuLY", 8.645480225e+04 }, 
>+      { "SoNwkyC", 1.086423086e+05 }, { "ul4qK0V", 1.365239515e+05 }, { "ka8lEEJ", 1.715610572e+05 }, 
>+      { "KE6hhDL", 2.155899828e+05 }, { "sP4x8p5", 2.709183625e+05 }, { NULL, 0 }
>+    };
>+#if 0
>+    for (double value = 0.001; value < 300000; value *= M_PI / 2.5) /* useful oscillator freqs */
>+      printf ("{ \"%s\", %-.9e }, \n", WaveChunkKey (value).as_string().c_str(), value);
>+#endif
>+    const double epsilon = 6e-8;    /* slightly more than BSE_FLOAT_EPSILON */

well, epsilon > BSE_FLOAT_EPSILON can easily be figured by a beginners mathematician.
the *why* is interesting, currently the comment tells me *nothing*.

>+    TSTART ("Decoding valid chunk keys");
>+    for (int i = 0; decode_test[i].key; i++)
>+      {
>+        WaveChunkKey key = WaveChunkKey (decode_test[i].key);
>+        TCHECK (key.is_valid());
>+        TCHECK (fabs ((decode_test[i].value - key.osc_freq()) / decode_test[i].value) < epsilon);
>+        if (i % 2 == 0)
>+          TICK();
>+      }
>+    TDONE();
>+    TSTART ("Encoding chunk keys");
>+    int i = 0;
>+    for (double value = 1e-20; value < 1e+20; value *= M_PI / 3.1)

for beast sources, we use PI from bseieee754.h.

>+      {
>+        WaveChunkKey encoded_key (value);
>+        WaveChunkKey decoded_key (encoded_key.as_string());
>+        TCHECK (decoded_key.is_valid());
>+        TCHECK (fabs ((value - decoded_key.osc_freq()) / value) < epsilon);
>+        if (i++ % 150 == 0)
>+          TICK();
>+      }
>+    TDONE();
>+    TSTART ("Invalid chunk keys");
>+    TASSERT (!WaveChunkKey ("badchr+").is_valid());
>+    TASSERT (!WaveChunkKey ("badlen").is_valid());
>+    TASSERT (!WaveChunkKey ("badlength").is_valid());
>+    TASSERT (!WaveChunkKey ("1234567").is_valid());
>+    TASSERT (!WaveChunkKey ("KE6hhDx").is_valid());
>+    TASSERT (!WaveChunkKey ("ka8leeJ").is_valid());
>+    TASSERT (!WaveChunkKey ("coolkey").is_valid());
>+    TDONE();
>+  }
> };
> 
> static bool
>@@ -503,6 +578,11 @@ wavetool_parse_args (int    *argc_p,
>           wavetool_print_version ();
>           exit (0);
>         }
>+      else if (parse_bool_option (argv, i, "--unit-test"))
>+        {
>+          WaveChunkKey::unit_test();
>+          exit (0);
>+        }
>       else if (parse_bool_option (argv, i, "-k"))
>         continue_on_error = true;
>       else if (parse_bool_option (argv, i, "-q"))
>@@ -2430,7 +2510,6 @@ public:
>   }
> } cmd_list_chunks ("list-chunks");
> 
>-
> /* TODO commands:
>  * bsewavetool.1 # need manual page
>  * bsewavetool merge <file.bsewave> <second.bsewave>

thanks a lot, the patch mostly looks good.

Comment 12 Stefan Westerfeld 2007-08-23 14:02:03 UTC

Created attachment 94189 [details] [review]
New unit test implementation for chunk keys

Changes
- M_PI -> PI
- improved comment describing the choice of the epsilon for the test

Comment 13 Stefan Westerfeld 2007-08-24 12:11:58 UTC

(In reply to comment #12)
> Created an attachment (id=94189) [edit]
> New unit test implementation for chunk keys
> 
> Changes
> - M_PI -> PI
> - improved comment describing the choice of the epsilon for the test
> 

By the way, maybe if I describe how I found the epsilon value, it will make review easier for you, because you'll have all facts I had when writing the justification.

First, I used only the difference between the value and expected value as assertion condition. But what maximum error could occor? I wanted to analyze how these differences looked, and gnuplotted them. What I found was that the higher the value, the higher the error. That was reasonable, since the float representation error grows with the value. So I divided by the value and looked at the plot again. Now the error was (more or less) evenly distributed between 0 and some maximum.

But what would that maximum be? I tried plotting a BSE_FLOAT_EPSILON line into the graph, and all values were below that. But then I started thinking: if this is so, why? And could there be a value which is exactly BSE_FLOAT_EPSILON or something slightly more than that...?

The answers I found to that question is in the comment, rationalizing the epsilon choice I made.

Comment 14 Tim Janik 2007-10-13 19:15:47 UTC

thanks, patch applied. please take a look at the source code history for the subsequent fixups and try to avoid them for future patches.