
The Studio of Eric Valosin

Saturday, October 15, 2016

OOOA Part TwOOOA

The performance date for our "Object Oriented Ontological Actions" crept nearer as I continued to refine my computer vision speech synthesizer project from my last post, The Word That Speaks Itself (Objectless Oriented Program).

I adjusted the variables to locate the hand positions relative to the person's body rather than to the screen itself, which helped account for a variety of user positions near or far from the camera. I also added in the complete array of spoken code, and tweaked the timing of the voices for a closer unison.

Unfortunately I had to manually input all 1478 "words," but I did at least devise a way to automate the process of fitting all those entries into the necessary code blocks. By listing all the items in separate columns of a Numbers spreadsheet, I could build each element, "speech[__] = '__';", where the first blank could be numbered algorithmically and the second blank was the respective word in the array. Then I used a formula to merge all the rows into a single column so that I could copy and paste the whole thing into Pages, where I could systematically use Find/Replace to remove the extra spaces and swap the smart quotes for straight quotes. Then I could copy and paste it all into Processing.
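If you're curious what that pipeline actually spits out, it's just a long run of array assignments in the sketch. A trimmed-down illustration (the words here are placeholders, not my real entries):

// declared once, sized to hold the full script of spoken code
String[] speech = new String[1478];

void loadSpeech() {
  // each line below was generated by the spreadsheet, one per "word"
  speech[0] = "import";
  speech[1] = "processing";
  speech[2] = "video";
  speech[3] = "star";
  // ...and so on, for all 1478 entries
}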

Our first performance...


was slated for Gallery Aferro in mid-May, and would be followed by a second performance in Brooklyn in June. I met with five of the other participating artists to discuss the framework around the John Zorn COBRA-inspired game rules that would dictate the performance.

Everyone's sound objects would be stationed around the gallery, and artists and viewers alike would be able to interact with them. We also contracted a jazz trio as a "house band," which (unbeknownst to them) would play standards and improvise in conjunction with the chaotic cacophony of all the sound objects joining the composition. 

A big, occultish wheel-of-fortune style wheel would then dictate musical rules, including time signature changes, volume or tempo changes, switching instruments, offering a vocalized "prayer," playing blindly by wearing an oversized mask, using a toy sword to cut an instrument out of the music for a time, simulating your instrument's sound a cappella, or manipulating a wooden paddle in whatever less-than-sadistic way the user saw fit. 

Of course, these were implausibly precise rules that most of the atonal instruments were incapable of following in the first place, which only added to the anarchic anti-structure of the whole ontological experiment.

Setting up the wheel and the tables
My setup: a Mac mini, keyboard, and Kinect sensor packed into the cabinet of an amp, with a monitor on top. The white tape denotes optimal range for interaction.

Some of the other projects included a delay-looped feedback pendulum, contact mics submerged in water with various crystals and rocks, a telegraph-turned-alarm bell, salvaged broken electronics out of which sound is coaxed (including a theremin!), and more.




One of the other artists interacting with my project

Molto-Allegro!

Fortissimo!

Heidi Lorenz-Wettach Hussa's repurposed theremin and other broken electronics



Jamming

There were still several challenges to troubleshoot. The biggest difficulty was programming it to account for all the unexpected variations in environment and interaction. 




I was thankful my program apparently had the flexibility to recognize many unorthodox body shapes, but it often had a hard time detecting users, especially with multiple people in the frame. If their hands went behind their back, the tempo would sometimes get stuck in super slow motion, essentially freezing the program. It would also get stuck after losing a tracked user, not realizing it was no longer tracking them. Oftentimes viewers didn't know whether they were being tracked or not. And of course I wasn't thrilled by the shrunken visuals, but I'd need to adjust many of my equations before being able to size up the screen.

All of this thankfully went hand in hand with the provisional, experimental nature of the show. 

Many kinks to work out before...

Off to Brooklyn!

In June we were invited to reprise our performance as part of the "Summit: Nature, Technology, Self" conference, led by the High Focus Institute at Kilroy Metal Ceiling in Brooklyn.

The venue was a gloriously defunct old warehouse, literally falling apart at the seams. It housed the projects of several other collectives and individuals, including prototypes of Terreform One's sustainable food shelters, which I got to see in person just days before I saw news articles popping up about it online!

Many groups held performances of one sort or another, including a DJ, a playwright, and an immersive projection/performance installation that involved an almost alien-mystic enlightenment "interview." It was tongue-in-cheek and utterly implausible, and yet it was maybe the most impactful and stirring performance I've ever engaged with.





Sitting in the queue to "interview" with the High Focus Institute.

After he helps you craft your "resume" (by infusing the objects on the table with your own personal history and ideologies), you're taken back behind the curtain to speak with his female counterpart. The psychedelic projection landscape enveloping you and the strangely empathic line of questioning slowly turn into a trancelike, swaying dance that may have changed my life. Not even being hyperbolic.

This time, we replaced our jazz trio with a progressive noise band out of Philly. The effect was extraordinarily different!




I reworked my project a bit, building in some textual user feedback on the tracking process to help confused viewers hang in there while it calibrated, and eliminating the perpetual slow-motion glitch. Most importantly, I got it to automatically abort and restart the tracking process when a user left the screen and it thought they were still there. I also built in a fail-safe nuclear option: one stroke of the keyboard could reboot the whole program.
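In case it helps anyone attempting something similar, the logic behind those fixes boiled down to something like this (a simplified sketch, not my exact code; the 5-second timeout and the 'r' key are illustrative choices):

int lastSeenUser = 0;         // millis() timestamp of the last confirmed tracking update
int trackingTimeout = 5000;   // give up after ~5 seconds without one

void watchTracking(boolean userVisible) {
  if (userVisible) {
    lastSeenUser = millis();
  } else if (millis() - lastSeenUser > trackingTimeout) {
    // the sensor thinks it still has a user, but nothing has updated in a while:
    // abort and restart tracking instead of freezing
    restartTracking();
  }
}

void restartTracking() {
  // stand-in for whatever re-initializes the Kinect user tracking
}

void keyPressed() {
  // fail-safe nuclear option: one keystroke resets the whole sketch
  if (key == 'r') {
    restartTracking();
    setup();
  }
}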



And it turned out I'd need it. Though the open space behind my piece's station caused some confusion in user tracking, it worked brilliantly. ...Until our noise band lived up to its name and cranked it up to 12 or 13 (11 is so 80's). 

I still don't know exactly what went wrong, but the leading theory is that the sound vibrations were so intense they disrupted the computer's hard drive and crashed the whole thing. Not just the program, the whole computer. Nothing would do anything. Either that, or there were so many amps and instruments producing wonky electromagnetic fields that they too messed with the hard drive. In any case, I spent a solid portion of the performance fighting with the technology and becoming a bystander. At least it had its moment in the sun beforehand, and it added to the anarchic feel of the whole endeavor.

I guess you could say it was a very Object Oriented moment for me, in the truest and most Heideggerian sense, as the tool's malfunctions betrayed the tool to me in all its glorious Ontological dysfunction. Mission accomplished?

Lots more to figure out before there's ever a third iteration of this project, but until then, if you're interested, here's a link to my full Processing code. Feel free to excerpt and adapt; just give credit where it's due.

This was a really fun challenge, and it afforded some experiences I likely wouldn't otherwise have had: working with the Oculus Collaborative and infiltrating Brooklyn's hipster-grunge experimental music scene. All in the name of Ontology.





"Oooh, Ahhh, OOOA:" The Word that Speaks Itself (Objectless Oriented Program)

Sound is a fundamental part of the mystical experience, from the cosmic Om to the subatomic vibrations of matter, to the Logos of Biblical spoken creation, to the oral tradition of Quranic verse, to the noetic unity of a band finding the pocket.

I'd been trying to figure out how to incorporate sound into my work for years, but could never quite find a way that didn't seem contrived or precocious.

However, this past spring I was invited by Oculus Art Collaborative to join them in an experimental sound exhibition at Gallery Aferro, and this seemed the perfect challenge to dig in and figure this out.

Object Oriented Ontological Action (O.O.O.A.), an experimental, participatory exhibition of artist-made sound objects.
The call was for "sound objects" that viewers could interact with to collaboratively create music within an "anarchic interactive public performance inspired by John Zorn’s COBRA and object-oriented ontological philosophies by Yale architecture professor Mark Foster Gage."

I thought about my usual wont toward immateriality, and what sort of postmodern, relational, mystical sound object I might be able to contribute.

Object Oriented Objectlessness


Phenomenologically speaking, Object Oriented Ontology aims to redirect one's attention away from function and back to the tool itself, which, as Martin Heidegger points out in Being and Time, is usually made invisible, veiled behind its function until such a time as it breaks and stops functioning and we stop taking the object's objectness for granted. In that sense, this exhibition was about the anarchic creation of function flowing out of the exploration of the object as object. Following Zorn's practice, it also lays bare the underlying arbitrariness of rule structures and guidelines, inevitably denying order while attempting to follow order and producing a new order of its own in the process. Our attention is drawn to the invisible object, and to the meta-framework we create around it.

It therefore also has something to do with making "invisible form" visible. Well, this seemed like a perfectly mystical place to start.

But it just so happens there is also a technological parallel. There's a type of programming in which mathematical functions get bundled up into "objects" that carry out those functions (likewise forcing visibility to shift from function to object). Languages built this way are aptly named object-oriented programming languages. Java is one of them, and Java just happens to be the basis of the Processing development environment, the very tool I've used for nearly all of my interactive new media projects.
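For the non-programmers, here's a toy example of what that bundling looks like in Processing's flavor of Java (a generic illustration, not part of my project):

// a tiny object that carries its own behavior around with it
class Bell {
  float pitch;

  Bell(float pitch) {
    this.pitch = pitch;
  }

  void ring() {
    println("ring at " + pitch + " Hz");
  }
}

void setup() {
  Bell lowBell = new Bell(220);  // the function lives inside the object...
  lowBell.ring();                // ...so we deal with the object, not the math
}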

The road seemed clearly laid before me at this point. I'd use an Object Oriented Program to create an Objectless Object that would be used in accordance with an Object Oriented Ontological experience!

My Objectless Sound Object


My proposal to Oculus was thus:

The Word that Speaks Itself (Objectless Oriented Program) is an objectless object that uses computer vision to track the viewer's hand motions within a defined space to control playback of the "object" chanting its own code. Two hands allow for two voices in harmony, and the motion/placement of the hands controls pitch, volume, and tempo. In that way the viewer becomes the conductor and the instrument. Fittingly, the project is created using Java, which is an "object oriented programming language." 
I'm interested, as you know, in the thingness of nothingness, and interactivity within a digitally mediated meditation, and I think this project would fit very well within the theme of the show. The title is taken from a Meister Eckhart sermon reflecting on the paradox of God being the Logos - the "Word" - and yet every word by nature needing to have been spoken. Thus God becomes the word that speaks itself.
To accomplish this, I essentially broke down the project into a set of variables tied to intuitive musical conducting motions. The position of the hands on the Y axis (high or low) would be tied to pitch, the X axis (out to the sides or near the body) to volume, and the Z axis (stretched out in front or pulled back near the body) to tempo.
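In Processing terms, that breakdown is essentially three map() calls. Here's a sketch of the idea (the coordinate ranges are placeholders for whatever the Kinect reports, not my actual calibration):

// high hand = high pitch, mapped across a rough vocal range in Hz
float pitchFromHand(float handY) {
  return map(handY, height, 0, 80, 400);
}

// the farther the hand is out from the body, the louder (0.0 - 1.0)
float volumeFromHand(float handX, float bodyX) {
  return constrain(map(abs(handX - bodyX), 0, width/2, 0, 1), 0, 1);
}

// stretched out in front = faster, pulled in close = slower (words per minute)
float tempoFromHand(float handZ) {
  return map(handZ, 0, 1000, 60, 220);
}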

The result would be like some sort of full-body, anthropomorphized, digital theremin that chants its own code like a self-generative, polyphonic monk!

Text-To-Speech

The first challenge was to get this thing to talk in the first place. If worst came to worst I knew I could make recordings of every necessary phoneme and then use the variable to trigger playback of a bazillion little .mp3 clips. But considering every device I own has some sort of text-to-speech accessibility option, I thought there must be a better way.

That's when I stumbled upon the FreeTTS library. A developer named Guru (whose website is currently unavailable) wrote a convenient Processing wrapper for this Java library that would allow me to have a synthesized voice read words I input into my Processing sketch, and I could even change the variable for pitch.

Here's a basic example sketch that uses threading to create two-voice harmony:
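(The embedded sketch isn't reproduced here, so the following is a minimal stand-in for the same idea, written against FreeTTS's own Voice API rather than Guru's wrapper. "kevin16" is one of FreeTTS's bundled voices, the pitches are arbitrary, and the FreeTTS .jars need to live in the sketch's code folder.)

import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

// one synthesized voice per thread, so two can chant (roughly) in unison
class Chanter extends Thread {
  Voice voice;
  float pitch;
  String text;

  Chanter(String text, float pitch) {
    this.text = text;
    this.pitch = pitch;
    voice = VoiceManager.getInstance().getVoice("kevin16");
    voice.allocate();
  }

  void run() {
    voice.setPitch(pitch);  // pitch in Hz
    voice.speak(text);
    voice.deallocate();
  }
}

void setup() {
  new Chanter("the word that speaks itself", 130).start();  // lower voice
  new Chanter("the word that speaks itself", 195).start();  // roughly a fifth above
}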



Tuning the Invisible Synthesizer

The problem quickly became apparent when I tried to map the y axis to pitch frequencies within a given vocal range. I knew A440 to be concert tuning, so I googled from there. However, though I've been a musician all my life, it had somehow never registered that there are multiple tunings for our Western 12-tone scale. As I tried to mathematically divide the range of frequencies I found into equal steps, I found they did not land on the true pitches, and the octaves didn't line up. The problem looked something like this:


I had to shift away from what turned out to be a Pythagorean scale and instead move to an equal-tempered scale, so that the pitch interval ratios would all be equal.
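The fix boils down to one formula: in equal temperament every semitone is the same ratio, the twelfth root of 2, so any pitch can be computed from A440 by counting semitones. A quick sketch:

// n = semitones above (positive) or below (negative) A440
float equalTemperedFreq(int n) {
  return 440 * pow(2, n / 12.0);
}

// equalTemperedFreq(12) gives 880, so the octave finally lines up,
// and equalTemperedFreq(-9) gives ~261.63, middle C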

Here's a site that describes the discrepancy pretty elegantly, and here perhaps more practically, and here's a good history of how this all developed in the first place. If you're having fun geeking out on this stuff like I was, follow the links in the first site and check out the books he recommends.

Once I got the pitches mapped to the screen, I was able to produce prototypes like this:



With a little more refining, I replaced the spoken filler text with excerpts of my program's own source code, and got some promising results.



Waxing "Poet"-ic

Eventually I found the Gurulib wrapper to be too confining. What it possesses in simplicity, it lacks in versatility: there were no controls for volume, tempo, or voices, and all the coding happened behind the scenes somewhere in the library's files, where it couldn't easily be manipulated. FreeTTS's API documentation showed that all this should be possible, but when I'd attempt to access those functions through the wrapper, it was a dead end.

So I took to the internet and discovered this dream come true buried in the comments of a forum thread: a homemade wrapper by the user Kof that puts the heavy lifting in the hands of a class called "Basnik."

Basnik - as I'm sure we all know - is the Slovak word for "Poet."

It was much more transparent in its setup, and it allowed far more use of FreeTTS's capabilities. I was able to dig into the FreeTTS API documentation and give the class a major facelift to access all the variables and create all the flexibility I needed.

Click here for my revised Basnik class. (code examples included)

That allowed me to create more sophisticated prototypes like this one:


With more tweaking to the aesthetics and user interface, I was ready to bring this to the public. Continue to my next post for the unveiling!



Processing and FreeTTS speech synthesizer - voice, pitch, tempo, volume control

This summer I participated in an experimental sound art exhibition, and in my quest to create a computer vision based Text-to-Speech sketch in Processing, I discovered there are really no good options out there.

The FreeTTS java library is more than sufficient in theory, but I could not find a Processing friendly wrapper that actually takes advantage of all FreeTTS has to offer.

Luckily, deep in the recesses of the Processing forum, the user Kof offered this "messy class called Basnik" in response to another user baffled by the same problems.

Kudos to Kof and his brilliant tutorial. He said to "modify as you need," so I did. I'm not a polished full-stack programmer by any stretch, but this should get the job done.

Here's a cleaned up, beefed up version of his Basnik class that allows for dynamic control of the voice, pitch, volume, tempo, range, transposition, and more.

  1. Follow his advice and download the FreeTTS speech synthesizer

  2. Put all the .jar files you download inside a folder called "CODE" within your Processing sketch's folder

  3. Open a separate tab in your Processing sketch and paste in the cleaned-up Basnik class linked above.
  4. In your main code, use a basic template along the lines of the sketch that follows this list.
  5. Create as many instances of Basnik as you want - you can thread them together to create polyphonic effects, use variables for dynamic control of the voices, and iterate through string arrays ("String[]") of words so you can change those controls mid-sentence.
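Since the embedded code doesn't appear in this post, here is only a rough, minimal sketch of the general shape that class and template take. It is not Kof's original class or my exact revision: the constructor signature, the "kevin16" voice name, and the numbers are illustrative assumptions, and it assumes the FreeTTS .jars are sitting in the sketch's code folder.

import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

// rough sketch of a Basnik-style wrapper: one thread per voice,
// with pitch / rate / volume re-applied on every word
class Basnik extends Thread {
  Voice voice;
  String[] words;
  float pitch;   // Hz
  float rate;    // words per minute
  float volume;  // 0.0 - 1.0

  Basnik(String[] words, String voiceName, float pitch, float rate, float volume) {
    this.words = words;
    this.pitch = pitch;
    this.rate = rate;
    this.volume = volume;
    voice = VoiceManager.getInstance().getVoice(voiceName);
    voice.allocate();
  }

  void run() {
    for (String w : words) {
      // settings are re-applied every word, so they can change mid-sentence
      voice.setPitch(pitch);
      voice.setRate(rate);
      voice.setVolume(volume);
      voice.speak(w);
    }
    voice.deallocate();
  }
}

// basic template: two threaded instances chanting in harmony
void setup() {
  String[] text = { "the", "word", "that", "speaks", "itself" };
  new Basnik(text, "kevin16", 130, 150, 1.0f).start();
  new Basnik(text, "kevin16", 195, 150, 1.0f).start();
}

Each instance runs on its own thread, so starting two of them at different pitches is what gets you the polyphonic chanting effect described above.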

I hope you find this useful. Feel free to use and adapt this and make it even better.