PyAudio - Convert stream.read to int to get the amplitude

I am trying to record audio and print the amplitude of the recorded signal at the same time. To do that, I store everything returned by stream.read. But when I try to print it, I get a sequence of bytes instead of integers. I would like to know how to convert these bytes to obtain the amplitude.

This is my code:

import pyaudio 
import wave 

CHUNK = 1024 
FORMAT = pyaudio.paInt16 
CHANNELS = 1 
RATE = 44100 
RECORD_SECONDS = 5 
WAVE_OUTPUT_FILENAME = "output.wav" 

p = pyaudio.PyAudio() 

stream = p.open(format=FORMAT, 
       channels=CHANNELS, 
       rate=RATE, 
       input=True, 
       frames_per_buffer=CHUNK) 

print("* recording") 

frames = [] 

for i in range(0, int(RATE/CHUNK * RECORD_SECONDS)): 
    data = stream.read(CHUNK) 
    frames.append(data) # 2 bytes(16 bits) per channel 

print("* done recording") 

stream.stop_stream() 
stream.close() 
p.terminate() 

for data in frames: 
    print(data)

And this is what I get:

 ����# ���� 
      
!$ 
      

       �� ���� �������������������������� 
      ������ �� ��           
�� 

    �� ������ ���������������������������� 
          ��  
            ����

% () , . % #


2016-04-04 Utopia

You can certainly take inspiration from the following code:

#!/usr/bin/python

# open a microphone in pyAudio and listen for taps

import pyaudio
import struct
import math

INITIAL_TAP_THRESHOLD = 0.010
FORMAT = pyaudio.paInt16
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100
INPUT_BLOCK_TIME = 0.05
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
# if we get this many noisy blocks in a row, increase the threshold
OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME
# if we get this many quiet blocks in a row, decrease the threshold
UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME
# if the noise was longer than this many blocks, it's not a 'tap'
MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME

def get_rms(block):
    # RMS amplitude is defined as the square root of the
    # mean over time of the square of the amplitude.
    # so we need to convert this string of bytes into
    # a sequence of 16-bit samples...

    # we will get one short out for each
    # two bytes in the string.
    count = len(block)//2  # integer division: struct needs an int count
    format = "%dh"%(count)
    shorts = struct.unpack(format, block)

    # iterate over the block.
    sum_squares = 0.0
    for sample in shorts:
        # sample is a signed short in +/- 32768.
        # normalize it to 1.0
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n

    return math.sqrt(sum_squares/count)

class TapTester(object):
    def __init__(self):
        self.pa = pyaudio.PyAudio()
        self.stream = self.open_mic_stream()
        self.tap_threshold = INITIAL_TAP_THRESHOLD
        self.noisycount = MAX_TAP_BLOCKS+1
        self.quietcount = 0
        self.errorcount = 0

    def stop(self):
        self.stream.close()

    def find_input_device(self):
        device_index = None
        for i in range(self.pa.get_device_count()):
            devinfo = self.pa.get_device_info_by_index(i)
            print("Device %d: %s"%(i,devinfo["name"]))

            for keyword in ["mic","input"]:
                if keyword in devinfo["name"].lower():
                    print("Found an input: device %d - %s"%(i,devinfo["name"]))
                    device_index = i
                    return device_index

        if device_index is None:
            print("No preferred input found; using default input device.")

        return device_index

    def open_mic_stream(self):
        device_index = self.find_input_device()

        stream = self.pa.open(format = FORMAT,
                              channels = CHANNELS,
                              rate = RATE,
                              input = True,
                              input_device_index = device_index,
                              frames_per_buffer = INPUT_FRAMES_PER_BLOCK)

        return stream

    def tapDetected(self):
        print("Tap!")

    def listen(self):
        try:
            block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
        except IOError as e:
            # dammit.
            self.errorcount += 1
            print("(%d) Error recording: %s"%(self.errorcount,e))
            self.noisycount = 1
            return

        amplitude = get_rms(block)
        if amplitude > self.tap_threshold:
            # noisy block
            self.quietcount = 0
            self.noisycount += 1
            if self.noisycount > OVERSENSITIVE:
                # turn down the sensitivity
                self.tap_threshold *= 1.1
        else:
            # quiet block.

            if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                self.tapDetected()
            self.noisycount = 0
            self.quietcount += 1
            if self.quietcount > UNDERSENSITIVE:
                # turn up the sensitivity
                self.tap_threshold *= 0.9

if __name__ == "__main__":
    tt = TapTester()

    for i in range(1000):
        tt.listen()

It comes from this post: Detect tap with pyaudio from live mic

You can easily adapt it to store the RMS values in a list and then plot that list.
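As a rough illustration of that adaptation, here is a minimal sketch that computes one RMS value per recorded block and collects the values in a list. The names block_rms and rms_table are hypothetical, and the two blocks are synthetic stand-ins for what stream.read would return; it assumes 16-bit samples in native byte order, as with pyaudio.paInt16.

```python
import math
import struct

def block_rms(block):
    """RMS of one block of 16-bit samples, normalized to the range 0.0..1.0."""
    count = len(block) // 2              # two bytes per 16-bit sample
    if count == 0:
        return 0.0
    shorts = struct.unpack("%dh" % count, block[:count * 2])
    sum_squares = sum((s / 32768.0) ** 2 for s in shorts)
    return math.sqrt(sum_squares / count)

def rms_table(frames):
    """One RMS value per block, in recording order."""
    return [block_rms(block) for block in frames]

# Synthetic stand-ins for two recorded blocks (assumption: not real microphone data).
quiet = struct.pack("4h", 0, 0, 0, 0)
loud = struct.pack("4h", 16384, -16384, 16384, -16384)

levels = rms_table([quiet, loud])
print(levels)
```

With real data, the list passed to rms_table would be the frames list filled in the recording loop, and levels could then be handed to any plotting library.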


2016-04-04 22:16:23 FLCcrakers

Thanks for your answer. I just added the get_rms function and stored the values in a list, and everything works. I get a list of amplitudes that rise or fall depending on whether I am talking or not. – Utopia

Nice. If you now really want to plot the values, I recommend using pyqt. – FLCcrakers

PyAudio gives you binary-encoded audio frames as bytes in a string. See the answer to this question for how to print a human-readable representation of your frames:

Get an audio sample as float number from pyaudio-stream


2016-04-04 22:16:17 NoThatIsTeal

Thank you for your answer. I just added the line "decoded = numpy.fromstring(data, 'Float32');" to my for loop, but the result is not conclusive. I got a list of very small numbers such as: 3.67348991e-40 6.42851276e-40 3.67355998e-40 6.42868091e-40 2.75502285e-40 1.10201895e-39 nan 4.59204105e-40 1.19389508e-39 1.37756747e-39 – Utopia

You need to use the correct format for your data. Try 'decoded = numpy.fromstring(data, dtype=numpy.int16)'. I suggest 'numpy.int16' because you defined the stream as consisting of 16-bit integer samples. If you want to try other sample formats, here is the list of those supported by numpy: http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.dtype.html#numpy.dtype – NoThatIsTeal
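To see why the wrong format produces such tiny numbers, here is a small sketch using only the standard struct module. It takes the same bytes and reads them once as 32-bit floats and once as 16-bit integers. The sample values are made up, and little-endian byte order is an assumption. In current NumPy versions the equivalent call would be numpy.frombuffer(data, dtype=numpy.int16), since numpy.fromstring is deprecated for binary data.

```python
import struct

# Hypothetical stand-in for one stream.read() result: four quiet 16-bit samples.
data = struct.pack("<4h", 4, 4, 2, 7)

# Wrong: every pair of 16-bit samples is read as one 32-bit float.
# Small integers land in the float's low bits, giving denormal values.
as_float32 = struct.unpack("<2f", data)

# Right: one signed 16-bit integer per two bytes.
as_int16 = struct.unpack("<4h", data)

print(as_float32)
print(as_int16)
```

The float interpretation also halves the number of values, which is another hint that the format does not match the stream.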

I think you could do this:

data = stream.read(CHUNK) 
for each in data: 
    print(each)
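One caveat, as far as I can tell: in Python 3, iterating over a bytes object yields single bytes (integers from 0 to 255), not 16-bit samples, so each sample is split across two values. The sketch below contrasts the two on a made-up chunk; the little-endian byte order is an assumption that holds on most desktop platforms.

```python
# Hypothetical stand-in for stream.read(CHUNK): three little-endian 16-bit samples.
data = b"".join(
    value.to_bytes(2, "little", signed=True) for value in (1000, -1000, 0)
)

# Iterating over bytes gives one integer per byte.
per_byte = list(data)

# Combining each pair of bytes gives the signed 16-bit samples.
samples = [
    int.from_bytes(data[i:i + 2], "little", signed=True)
    for i in range(0, len(data), 2)
]

print(per_byte)  # [232, 3, 24, 252, 0, 0]
print(samples)   # [1000, -1000, 0]
```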


2017-09-09 11:43:19
