Acoustic levitation is achieved via phase-synchronized arrays of ultrasonic transducers (above 40 kHz; here 40,400 Hz, i.e. 40.4 kHz), which generate structured standing-wave fields through controlled interference. These fields contain stable acoustic pressure nodes that trap and stabilize low-mass particles (fluorescent microplastics) in mid-air. Continuous phase modulation deforms the nodal topology dynamically, allowing high-resolution spatial positioning and micro-adjustments in real time.
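The role of phase synchronization can be illustrated with a minimal sketch, which is not the installation's actual driver code. It computes one phase offset per transducer so that all waves arrive in phase at a chosen point. The function name `focus_phases`, the speed of sound, and the sign convention are assumptions. A focus on its own is a pressure maximum; practical trap designs superimpose a further phase pattern on top of it, which is not shown here.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 °C (assumed value)
FREQUENCY = 40_400.0     # Hz, the drive frequency stated in the text


def focus_phases(transducers, focus):
    """Return one phase offset in [0, 2*pi) per transducer so that all
    emitted waves arrive in phase at `focus` (positions in metres)."""
    k = 2.0 * math.pi * FREQUENCY / SPEED_OF_SOUND  # wavenumber, rad/m
    phases = []
    for position in transducers:
        distance = math.dist(position, focus)
        # Cancel the propagation phase k*d; the sign convention is assumed
        # and may be inverted depending on the driver electronics.
        phases.append((-k * distance) % (2.0 * math.pi))
    return phases


# Hypothetical 2-element array, focus 5 cm above its centre.
array = [(-0.01, 0.0, 0.0), (0.01, 0.0, 0.0)]
phases = focus_phases(array, (0.0, 0.0, 0.05))
```

Moving the `focus` argument in small steps and recomputing the phases is one way to picture how continuous phase modulation shifts the field.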
Note: Acoustic levitation is a physical phenomenon in which sound waves suspend objects in a fluid medium, typically air, by counteracting gravity. It relies on a standing wave, formed when two coherent sound waves of the same frequency and amplitude propagate in opposite directions; in the simplest setup, an ultrasonic transducer emits the wave and a rigid surface reflects it back. The interference produces a spatially fixed pattern of nodes, where the pressure variation is minimal, and antinodes, where it is maximal. Within this pressure field, small objects experience a net force known as the acoustic radiation force, which arises from the sound wave scattering off the object. It is a time-averaged effect: the pressure oscillates tens of thousands of times per second, but the force averaged over many cycles is nonzero. Depending on the object’s density and compressibility relative to the surrounding medium, it is driven toward either the nodes or the antinodes, where stable equilibrium positions exist; solid particles in air, being far denser and stiffer than the medium, collect near the pressure nodes. When the upward acoustic radiation force balances the downward gravitational force, the object remains suspended in mid-air. In practice, ultrasonic frequencies around 40 kHz are used because their short wavelengths (roughly 8.5 mm in air) produce closely spaced nodes and gradients steep enough to generate measurable forces on small masses. The underlying physics is governed by wave mechanics and fluid dynamics and can be described quantitatively using models such as Gor’kov’s potential, whose negative gradient gives the force on a small particle in terms of the time-averaged acoustic pressure and particle-velocity fields. Although the forces involved are small, they are sufficient to levitate droplets, small particles, or lightweight solids, making acoustic levitation a precise, contactless method for manipulating matter with the structured energy of sound.
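As a worked example of the node geometry described in the note, the following sketch computes the node spacing for the idealized transducer-plus-reflector case. The speed of sound (343 m/s) and the helper names are assumptions; a real phased-array field is more complex than this one-dimensional picture.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def node_spacing(frequency_hz, c=SPEED_OF_SOUND):
    """Distance between adjacent pressure nodes: half a wavelength."""
    wavelength = c / frequency_hz
    return wavelength / 2.0


def node_heights(frequency_hz, count, c=SPEED_OF_SOUND):
    """Heights of the first `count` pressure nodes above a rigid reflector.

    A rigid surface is a pressure antinode, so the first node sits a
    quarter wavelength above it and the rest follow every half wavelength.
    """
    wavelength = c / frequency_hz
    return [wavelength / 4.0 + n * wavelength / 2.0 for n in range(count)]


spacing_mm = node_spacing(40_400.0) * 1000.0  # about 4.2 mm between nodes
```

At 40.4 kHz the nodes are therefore only a few millimetres apart, which sets the scale of the particles that can be trapped.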
Robotic kinematics: The levitation system is synchronized with three UFactory robotic arms operating in joint-velocity control mode. Motion generation combines inverse kinematics with continuous velocity profiling (S-curve easing, slew-rate limiting, and low-pass filtering), translating discrete targets into physically coherent, jerk-minimized trajectories. This is implemented as a layered real-time control stack with feedback stabilization and soft-limit recovery mechanisms.
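The three velocity-profiling stages named above can be sketched as follows. This is an illustrative chain under stated assumptions, not the installation's control stack: all function names and parameters are hypothetical, smoothstep stands in for whichever S-curve the system uses, and no robot SDK calls are shown.

```python
def s_curve(t):
    """Smoothstep easing: 0 at t=0, 1 at t=1, zero slope at both ends."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def slew_limit(previous, target, max_delta):
    """Step from `previous` toward `target` by at most `max_delta`."""
    delta = max(-max_delta, min(max_delta, target - previous))
    return previous + delta


def low_pass(previous, sample, alpha):
    """First-order exponential smoothing; alpha in (0, 1]."""
    return previous + alpha * (sample - previous)


def shape_velocity(commands, max_accel, dt, alpha):
    """Turn raw joint-velocity commands into a rate-limited, smoothed
    sequence (one value per control tick of length `dt` seconds)."""
    shaped, limited, filtered = [], 0.0, 0.0
    for command in commands:
        limited = slew_limit(limited, command, max_accel * dt)
        filtered = low_pass(filtered, limited, alpha)
        shaped.append(filtered)
    return shaped


# A step command of 1 rad/s becomes a bounded-acceleration ramp.
ramp = shape_velocity([1.0] * 50, max_accel=2.0, dt=0.01, alpha=0.5)
```

Placing the low-pass filter after the slew limiter keeps the acceleration bound intact, since smoothing a rate-limited signal cannot increase its step size.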
Data-driven modulation: The system continuously ingests environmental and contextual data streams: air-quality and meteorological parameters (PM2.5, PM10, NO₂, SO₂, O₃, CO, humidity, temperature, atmospheric pressure, and wind speed) retrieved via API, alongside visitor tracking from an infrared vision system running YOLO object detection. These heterogeneous inputs are algorithmically mapped onto motion parameters, primarily velocity, acceleration, and oscillation frequency, so that the system’s kinetic behavior is continuously reparameterized in response to external conditions.
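One simple way to picture such a mapping is to normalize each input to the range 0..1 and interpolate linearly between two motion values. The sketch below assumes exactly that; the input keys, value ranges, and output names are all hypothetical and are not taken from the installation.

```python
def normalize(value, low, high):
    """Scale `value` from [low, high] to [0, 1], clamping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(max((value - low) / (high - low), 0.0), 1.0)


def lerp(start, end, t):
    """Linear interpolation between `start` and `end` for t in [0, 1]."""
    return start + (end - start) * t


def map_to_motion(reading):
    """Map one environmental reading to motion parameters.

    The ranges below (PM2.5 in ug/m3, wind in m/s, visitor count) are
    illustrative assumptions only.
    """
    pollution = normalize(reading["pm2_5"], 0.0, 150.0)
    wind = normalize(reading["wind_speed"], 0.0, 20.0)
    crowd = normalize(reading["visitors"], 0.0, 10.0)
    return {
        "velocity_scale": lerp(1.0, 0.2, pollution),  # dirtier air, slower
        "acceleration_scale": lerp(0.3, 1.0, crowd),  # more visitors, livelier
        "oscillation_hz": lerp(0.1, 2.0, wind),       # windier, faster sway
    }


params = map_to_motion({"pm2_5": 35.0, "wind_speed": 4.0, "visitors": 3})
```

Clamping in `normalize` keeps an outlier reading from pushing the arms beyond their intended envelope.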
Sonic synthesis: In parallel, a machine learning–based voice synthesis engine (Kokoro, run via ONNX) transforms environmental data into a generative acoustic layer. The data is processed not only semantically (as text-to-speech, TTS) but also spectrally: air quality indices and atmospheric fluctuations directly modulate pitch, timbre, formant structure, and amplitude dynamics. The resulting output ranges from low-frequency, diffuse murmurs to high-intensity, noise-like emissions, intermittently coalescing into quasi-human vocal textures. The implementation further incorporates dynamic audio processing such as echo stacking and frequency-dependent filtering driven by AQI values.
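The AQI-driven processing can be illustrated with a small sketch that works on a plain list of audio samples. It is an assumption-laden illustration, not the installation's audio chain: the mapping in `aqi_to_params`, the 0..300 AQI range, and all names are hypothetical, and the speech synthesis itself is not shown.

```python
def aqi_to_params(aqi):
    """Map an air quality index to effect settings (assumed 0..300 range)."""
    level = min(max(aqi / 300.0, 0.0), 1.0)
    return {
        "echo_taps": 1 + round(level * 4),  # 1 to 5 stacked echoes
        "echo_gain": 0.3 + 0.4 * level,     # louder repeats in worse air
        "smoothing": 1.0 - 0.8 * level,     # lower value, darker sound
    }


def stack_echoes(samples, delay, taps, gain):
    """Add `taps` delayed copies of the signal, each `gain` times quieter."""
    out = list(samples) + [0.0] * (delay * taps)
    for tap in range(1, taps + 1):
        scale = gain ** tap
        for i, sample in enumerate(samples):
            out[i + tap * delay] += sample * scale
    return out


def one_pole_low_pass(samples, alpha):
    """Simple low-pass filter; alpha=1.0 passes the signal unchanged."""
    out, state = [], 0.0
    for sample in samples:
        state += alpha * (sample - state)
        out.append(state)
    return out


settings = aqi_to_params(150.0)
wet = stack_echoes([1.0, 0.0, 0.0], delay=2,
                   taps=settings["echo_taps"], gain=settings["echo_gain"])
dark = one_pole_low_pass(wet, settings["smoothing"])
```

In this sketch a higher AQI yields more echoes and a duller spectrum, one plausible reading of "frequency-dependent filtering driven by AQI values".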
Perceptual feedback loop: A vision-based tracking system (YOLO, infrared/night vision) detects visitor presence, position, and movement density. This data is continuously fed back into the system, modulating both robotic motion and sonic output. The result is a closed-loop, cybernetic architecture characterized by bidirectional coupling between environment, machine behavior, and perceptual response.
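How detections might be reduced to presence, position, and movement density can be sketched as below. The input is assumed to be a list of bounding boxes `(x1, y1, x2, y2)` in pixels, as a detector would typically return; the detector call itself is omitted, and every function name and threshold is hypothetical.

```python
import math


def centroids(boxes):
    """Centre point of each bounding box (x1, y1, x2, y2)."""
    return [((x1 + x2) / 2.0, (y1 + y2) / 2.0) for x1, y1, x2, y2 in boxes]


def movement_density(previous_points, points):
    """Mean distance from each centroid to the nearest centroid in the
    previous frame, in pixels per frame; 0.0 if either frame is empty."""
    if not previous_points or not points:
        return 0.0
    total = sum(min(math.dist(p, q) for q in previous_points) for p in points)
    return total / len(points)


def feedback_gain(count, density, max_count=10, max_density=50.0):
    """Blend presence and motion into one 0..1 modulation value."""
    presence = min(count / max_count, 1.0)
    motion = min(density / max_density, 1.0)
    return 0.5 * presence + 0.5 * motion


previous = centroids([(100, 100, 140, 200)])
current = centroids([(110, 100, 150, 200), (300, 120, 340, 220)])
gain = feedback_gain(len(current), movement_density(previous, current))
```

A single value such as `gain` could then scale both the motion parameters and the audio effects, which closes the loop between visitors and machine behavior.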