
Generative Adversarial Networks are now also inventing sound

Saturday, February 17th, 2018 | bitcoin updates

          
    
    
    (Picture: Donahue et al.)
                
            
             Researchers at the University of California, San Diego have adapted an idea that is popular for image generation so that it produces sound. A pair of competing neural networks learns to pronounce digits from unlabeled data.
            

        

        A team of one musician and two computer scientists from the University of California, San Diego has produced short audio samples with a Generative Adversarial Network (GAN). The algorithm learns to pronounce digits without needing labels that state which digit is being spoken in each training recording. In the future, this could allow speech synthesis systems to be trained on existing recordings rather than on specially prepared training data. The audio examples accompanying the paper show that the digits are intelligible to humans but still sound very synthetic.
In a GAN, two neural networks work against each other. The first network, the generator, produces data, in this case audio clips of 16,384 samples (about one second at 16 kHz). The second network, the discriminator, tries to decide whether each example it sees was produced by the generator or is a real example from the training data. In practice, the standard GAN objective often yields no usable gradient for training, so it often makes sense to minimize the Wasserstein-1 distance between the generated data and the training examples instead. This variant is called WGAN.
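The adversarial setup described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the researchers' code: the function names and the example scores are made up, and the sketch leaves out the Lipschitz constraint (for example weight clipping or a gradient penalty) that a real WGAN needs.

```python
from statistics import mean

SAMPLE_RATE_HZ = 16_000   # sampling rate named in the article
NUM_SAMPLES = 16_384      # clip length named in the article


def clip_duration_seconds(num_samples, sample_rate_hz):
    """Length of a generated clip in seconds."""
    return num_samples / sample_rate_hz


def critic_loss(real_scores, fake_scores):
    """WGAN-style loss for the discriminator (often called the critic).

    Minimizing it pushes scores for real clips up and scores for
    generated clips down; the score gap estimates the Wasserstein-1
    distance (only if the critic is kept Lipschitz, omitted here).
    """
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """The generator tries to raise the critic's score for its clips."""
    return -mean(fake_scores)


# Hypothetical critic outputs for three real and three generated clips.
real_scores = [0.9, 1.1, 1.0]
fake_scores = [-0.5, -0.3, -0.4]

duration = clip_duration_seconds(NUM_SAMPLES, SAMPLE_RATE_HZ)
d_loss = critic_loss(real_scores, fake_scores)
g_loss = generator_loss(fake_scores)
print(duration, d_loss, g_loss)
```

Unlike the yes/no decision of a standard discriminator, the scores here are unbounded real numbers, which is why the loss still provides a gradient when real and generated clips are easy to tell apart.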

1-dimensional Convolutions

  

          
          Instead of 5×5 as with DCGAN, WaveGAN uses 25×1 kernels for the convolution operations.
        
          (Picture: Donahue et al.)

        
    
The researchers wanted to transfer the basic idea of DCGAN, a network for generating images, to audio, and replaced its 5×5 convolution kernel with a one-dimensional 25×1 kernel. In addition, they doubled the step size (stride) to better respond to periodic structures, which are much more common in audio than in images.
They call the resulting network WaveGAN and compare it in their paper with a DCGAN variant that generates spectrograms (SpecGAN). According to the researchers' computable quality metrics, both variants produce audio examples of similar quality. To human listeners, however, the WaveGAN examples sound much better.
The researchers want to spark more interest in using GANs to generate sound. They also hope for better methods of evaluating the results automatically.
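To make the kernel change concrete, here is a minimal sketch of a strided one-dimensional convolution in plain Python. It is not taken from the paper: the function name `conv1d`, the zero padding, and the moving-average kernel are illustrative assumptions, and the stride of 4 assumes that the doubling starts from a stride of 2, which the article does not state.

```python
def conv1d(signal, kernel, stride):
    """Strided 1-D convolution (cross-correlation) with zero padding.

    Padding by half the kernel length on each side means the output
    has ceil(len(signal) / stride) values.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for start in range(0, len(signal), stride):
        window = padded[start:start + len(kernel)]
        out.append(sum(w * k for w, k in zip(window, kernel)))
    return out


KERNEL_LENGTH = 25   # one-dimensional kernel from the article
STRIDE = 4           # assumption: double a DCGAN-style stride of 2

# A 5x5 image kernel and a 25x1 audio kernel hold the same number of weights.
weights_2d = 5 * 5
weights_1d = KERNEL_LENGTH * 1

# Hypothetical input: one clip of 16384 samples, all equal to 1.0.
clip = [1.0] * 16384
kernel = [1.0 / KERNEL_LENGTH] * KERNEL_LENGTH   # simple moving average
features = conv1d(clip, kernel, STRIDE)
print(weights_2d == weights_1d, len(features))
```

With a stride of 4, each layer shortens the signal by a factor of four, so a clip of 16,384 samples becomes 4,096 values after a single layer.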

(JME)

      
