RoosRoos 1 Posted August 11
I had to cut audio files down to below 14MB to make audio transcription work. The feature is wonderful, but this is a cumbersome workaround, and it is strange that it says files up to 25MB are possible. Will this be addressed, so that it works at least up to 25MB and preferably higher? I did notice some glitches with the Dutch language, but it still saves time and concentration not to have to transcribe myself.
Level 5* gazumped 12,226 Posted August 14
Hi - If you haven't already, please send your comment to feedback@evernote.com - we're mostly other users here. Have you tried using lower quality audio to reduce file size? What device(s) are you using to record audio, or where are you sourcing the files?
RoosRoos 1 Posted August 15 (Author)
Hi, I will send it to Evernote - good idea. It was a Zoom recording; I downsized it from MP4 to MP3 and then had to cut it into pieces. The Zoom recording was made by somebody else, so I don't know what device was used.
Level 5* gazumped 12,226 Posted August 15
Downsizing from MP4 to MP3 should allow you to reduce the quality of a file without losing content, though in extreme cases I guess the transcription might suffer if the quality is too low. It would be good if Evernote published some guidelines on what parameters are required for transcription - I asked AI for a list. Maybe Evernote could comment on the minimums required for some of these factors.

Bit rate: The amount of data processed per unit of time, usually expressed in kilobits per second (kbps). Higher bit rates generally mean better quality.
Sample rate: The number of samples of audio taken per second, measured in Hertz (Hz) or kilohertz (kHz). Common rates are 44.1 kHz and 48 kHz.
Bit depth: The number of bits used to represent each sample. Common depths are 16-bit and 24-bit. Higher bit depth allows for more dynamic range.
File format: WAV, MP3, FLAC, AAC, etc. Some formats are lossless while others are lossy.
Dynamic range: The difference between the loudest and quietest parts of the recording.
Signal-to-noise ratio (SNR): The level of desired signal compared to background noise.
Frequency response: The range of frequencies the recording can reproduce accurately.

Then, if you're using a third-party file source, or recording via an external app, you could minimise the output file size with confidence. (I sent this list to feedback@ as a suggestion.)
Level 5 PinkElephant 9,015 Posted August 15
Language spoken, and dialects that are - sorry - out of luck … 🤷♂️
l0nEr 8 Posted November 29
Hi @gazumped, did you get the requirements from Evernote? I've been trying with various MP3 files, all below 12MB and about 1 hour long... none of them worked. Not sure why. Thanks.
Jon/t 1,753 Posted November 29
2 hours ago, l0nEr said: Hi @gazumped, did you get the requirements from Evernote? I've been trying with various MP3 files, all below 12MB and about 1 hour long... none of them worked. Not sure why. Thanks.

Are you sure it's a proper MP3? 12MB seems a bit small for an hour. Spoken word exported at 128 kbps usually works out at around 1MB per minute-ish. The limit for Evernote is 25MB, so around 25 minutes.
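The rule of thumb above (128 kbps is roughly 1MB per minute) can be checked with a little arithmetic. This is only a rough sketch: it assumes a constant bitrate, ignores file header and tag overhead, and counts 1 MB as 1,000,000 bytes, which may not match how Evernote measures its limit. The function names are made up for illustration.

```python
def estimate_size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Approximate size in MB (1 MB = 1,000,000 bytes) of constant-bitrate audio."""
    # kilobits per second -> bytes per second -> total bytes for the duration
    bytes_total = bitrate_kbps * 1000 / 8 * minutes * 60
    return bytes_total / 1_000_000


def max_minutes(limit_mb: float, bitrate_kbps: float) -> float:
    """Approximate number of minutes of audio that fit under a size limit."""
    return limit_mb / estimate_size_mb(bitrate_kbps, 1)


if __name__ == "__main__":
    print(round(estimate_size_mb(128, 1), 2))  # 0.96 MB per minute at 128 kbps
    print(round(max_minutes(25, 128), 1))      # 26.0 minutes under a 25 MB limit
    print(round(estimate_size_mb(32, 60), 1))  # 14.4 MB for one hour at 32 kbps
```

By the same estimate, an hour of audio at around 32 kbps comes to roughly 14MB, so a one-hour file in the 12MB range is plausible at a low variable bitrate.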
l0nEr 8 Posted November 29
7 hours ago, Jon/t said: Are you sure it's a proper MP3? 12MB seems a bit small for an hour. Spoken word exported at 128 kbps usually works out at around 1MB per minute-ish. The limit for Evernote is 25MB, so around 25 minutes.

I used Audacity to downsize the file: variable bitrate around 32 kbps, sample rate 16 kHz, and I tried to make it mono. I had no issues transcribing that file with Whisper v3 (run locally) or assembly.ai, though.
https://dev.to/mxro/optimise-openai-whisper-api-audio-format-sampling-rate-and-quality-29fj?comments_sort=top
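For anyone who would rather script this than use Audacity, the settings described above (mono, 16 kHz, about 32 kbps) could be sketched as an ffmpeg call. This assumes ffmpeg is installed and on the PATH; the function name and file names are hypothetical, and it sets a fixed target bitrate rather than the variable bitrate mentioned in the post. Nothing here is confirmed by Evernote as what its transcription requires.

```python
from typing import List


def build_ffmpeg_args(src: str, dst: str, bitrate_kbps: int = 32,
                      sample_rate_hz: int = 16000) -> List[str]:
    """Build an ffmpeg command that extracts a small mono audio file from src."""
    return [
        "ffmpeg",
        "-i", src,                   # input file (for example a Zoom MP4)
        "-vn",                       # drop any video stream
        "-ac", "1",                  # one audio channel (mono)
        "-ar", str(sample_rate_hz),  # audio sample rate in Hz
        "-b:a", f"{bitrate_kbps}k",  # target audio bitrate
        dst,                         # output file; format follows the extension
    ]


if __name__ == "__main__":
    args = build_ffmpeg_args("zoom_recording.mp4", "zoom_recording.mp3")
    print(" ".join(args))
    # To actually run the conversion:
    #   import subprocess
    #   subprocess.run(args, check=True)
```

Printing the command first makes it easy to check the settings before converting anything.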
Jon/t 1,753 Posted November 29
11 minutes ago, l0nEr said: I used Audacity to downsize the file: variable bitrate around 32 kbps, sample rate 16 kHz, and I tried to make it mono. I had no issues transcribing that file with Whisper v3 (run locally) or assembly.ai, though. https://dev.to/mxro/optimise-openai-whisper-api-audio-format-sampling-rate-and-quality-29fj?comments_sort=top

That's very odd, because I think (though I may be wrong) that they use Whisper AI for the transcription! Send in a support ticket and reply to the auto email with your logs (Help menu) and, if you can, a copy of the audio.
Level 5* gazumped 12,226 Posted November 29
12 hours ago, l0nEr said: Did you get the requirements from Evernote?

Hi. No - I used feedback to suggest the idea, but that doesn't prompt them to comment. As this is a third-party service, getting some 'official' feedback might take a while. Let us know what sort of a reaction you get to a support request!
EddieO23 0 Posted December 1
You guys aren't the only ones having transcription issues. I'm trying to transcribe .mp4 files under 25MB and it's not even showing up, on desktop or mobile.