Olewave.com provides extensive, customizable multimodal datasets in various languages and topics, amounting to millions of hours. Our pricing is 1/10th that of traditional vendors, with 5x data efficacy. We also offer a robust and customizable data collection and cleaning pipeline that can run on your site or in the cloud. Our data service and pipeline enable you to train top-notch GPT-4o-style multimodal or Generative AI models in-house. Specializing in data since 2015, we ensure quality and reliability.

See the introduction to Olewave's large-scale conversational speech dataset with speaker labels and high-quality transcriptions. Data samples are also presented:
https://www.linkedin.com/pulse/olewave-large-scaled-convesational-speech-dataset-olewave-927ic/

Olewave's YouTube channel, mostly paper readings on Speech and NLP:
https://www.youtube.com/channel/UCm99ZwZ1bODHskkHwMOue0w

I have a mirror channel for my Chinese audience (most of whom cannot access YouTube):
https://space.bilibili.com/12477508

I am the organizer of the Speech and NLP Meetup Group:
https://www.meetup.com/speech-and-language-technology-meetup-group

* If your institute is interested in sponsoring or hosting an offline meetup with Speech and NLP people in the Bay Area, feel free to contact me.
Listed skills include Machine Learning, Signal Processing, Pattern Recognition, Speech Recognition, and 7 others.