Sync Google Drive Files + Transcription

Sync a Google Drive folder into Breyta. Audio and video are auto-transcribed and all files are saved to a private Synced Drive files folder with original names.

Created by Chris Moen • Version 80 • 21 steps

Centralise and transcribe your files automatically

Manage your media and documents without the manual overhead. This workflow monitors a specific Google Drive folder and syncs its contents directly into Breyta. It identifies any audio or video files and sends them to AssemblyAI for transcription. Once processed, your files are stored in a private folder with their original filenames, ensuring your workspace remains organised and searchable.

Integrated tech stack

This template connects three essential tools to handle your file management. It uses Google Drive for storage and file retrieval, AssemblyAI for accurate speech-to-text processing, and Breyta's internal Key-Value (KV) storage to track sync states. By combining these, you ensure that no file is missed and no recording is transcribed twice, saving you both time and API credits.
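To illustrate how tracking state in KV storage can prevent a recording from being transcribed twice, here is a minimal sketch. It is not the template's actual code: a plain dict stands in for Breyta's KV store, and the key name `pending_transcriptions` and both helper functions are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical stand-in for Breyta's KV store: a plain in-memory dict.
KV = Dict[str, dict]

PENDING_KEY = "pending_transcriptions"  # assumed key name, not from the template


def files_needing_transcription(kv: KV, media_file_ids: List[str]) -> List[str]:
    """Return the media file ids that have no transcription recorded in KV yet."""
    recorded = kv.get(PENDING_KEY, {})
    return [file_id for file_id in media_file_ids if file_id not in recorded]


def mark_queued(kv: KV, file_id: str, transcript_id: str) -> None:
    """Record that a file was submitted, so later runs skip it."""
    kv.setdefault(PENDING_KEY, {})[file_id] = {
        "transcript_id": transcript_id,
        "status": "queued",
    }
```

Because the check happens before any file is submitted, a file that shows up again in a later run is filtered out instead of being sent for transcription a second time.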

How the workflow operates

The process starts by loading your last sync state to see what has changed since the previous run. On a first sync, where no state exists yet, it resolves a lookback window that bounds how far back to search. It then loads any pending transcriptions that are still in progress, lists recent files in your Google Drive folder page by page, and selects the oldest batch of files for this run to manage system load effectively.
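The lookback window and oldest-first batching described above could look roughly like the sketch below. The function names, the `overlap` margin, and the record shape are illustrative assumptions; the template's real parameters are not shown on this page.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, TypedDict


class DriveFile(TypedDict):
    id: str
    name: str
    modified: datetime


def window_start(last_sync: Optional[datetime], now: datetime,
                 first_sync_lookback: timedelta, overlap: timedelta) -> datetime:
    """Pick the earliest modification time this run should consider."""
    if last_sync is None:
        # First sync: no saved state, so fall back to a fixed lookback.
        return now - first_sync_lookback
    # Later syncs: start slightly before the last sync (assumed safety margin).
    return last_sync - overlap


def oldest_batch(files: List[DriveFile], since: datetime,
                 batch_size: int) -> List[DriveFile]:
    """Keep files modified at or after `since` and return the oldest ones first."""
    recent = [f for f in files if f["modified"] >= since]
    recent.sort(key=lambda f: f["modified"])
    return recent[:batch_size]
```

Taking the oldest files first means a large backlog is worked through in order across several runs rather than all at once.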

For every batch, the automation separates standard documents from media files. It classifies your recordings, plans the necessary downloads, and triggers the transcription process. Finally, the system updates its internal records, saves the new sync state, and stores the metadata for your files. This step-by-step approach ensures that even large folders are indexed accurately and reliably.
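One simple way to separate documents from media is to look at each file's MIME type, which Google Drive reports in the `mimeType` field of a file resource. The sketch below assumes that approach; the template's own classification rules may differ, and `is_media` and `split_batch` are hypothetical names.

```python
from typing import Dict, List, Tuple

FileRecord = Dict[str, str]


def is_media(mime_type: str) -> bool:
    """Treat any audio/* or video/* MIME type as a recording."""
    return mime_type.startswith(("audio/", "video/"))


def split_batch(batch: List[FileRecord]) -> Tuple[List[FileRecord], List[FileRecord]]:
    """Split a batch into (documents, media) based on each file's mimeType."""
    documents = [f for f in batch if not is_media(f.get("mimeType", ""))]
    media = [f for f in batch if is_media(f.get("mimeType", ""))]
    return documents, media
```

Files without a recognisable MIME type fall into the documents group here, so they are still synced but never sent for transcription.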

Key benefits for your team

  • Automatic organisation: Files keep their original names and are stored in a dedicated, private synced folder.
  • Searchable media: Turn your video and audio recordings into text automatically, making your internal knowledge base easy to search.
  • Smart syncing: The workflow uses a lookback window and batching to ensure reliability, even with large volumes of data.
  • Reduced manual work: Stop downloading and re-uploading files between different platforms or manually triggering transcriptions.
  • Efficient processing: The system tracks pending jobs and only reconciles what is necessary, preventing duplicate costs.

Steps

  1. Normalize manual transcription ids (function)
  2. Resolve first sync lookback window (function)
  3. Load last sync state (kv)
  4. Load pending AssemblyAI transcriptions (kv)
  5. Select pending transcriptions to reconcile (function)
  6. Collect file ids already queued for transcription (function)
  7. List recent AssemblyAI transcriptions (http)
  8. Index pending transcript summaries (function)
  9. List files in Google Drive folder (http)
  10. Pause before next page (wait)
  11. Select oldest file batch for this run (function)
  12. Plan non-media file sync items for this batch (function)
  13. Select audio and video files in this batch (function)
  14. Classify recording files for transcription (function)
  15. Plan media downloads for transcription (function)
  16. Summarize recording processing (function)
  17. Summarize transcript reconciliation (function)
  18. Save pending AssemblyAI transcriptions (kv)
  19. Build Drive file metadata records for this batch (function)
  20. Build next sync state (function)
  21. Save last sync state (kv)
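For readers curious what steps 9 and 10 might involve, the sketch below builds query parameters for the Google Drive API v3 `files.list` endpoint, which supports paging through a folder's files. The exact request this template sends is not shown here, so the chosen fields, ordering, and the helper name `drive_list_params` are assumptions.

```python
from typing import Dict, Optional

DRIVE_FILES_URL = "https://www.googleapis.com/drive/v3/files"


def drive_list_params(folder_id: str, modified_after: str,
                      page_token: Optional[str] = None,
                      page_size: int = 100) -> Dict[str, str]:
    """Build files.list parameters for one page of a folder listing.

    `modified_after` is an RFC 3339 timestamp such as 2024-01-01T00:00:00Z.
    """
    query = (
        f"'{folder_id}' in parents and trashed = false "
        f"and modifiedTime > '{modified_after}'"
    )
    params = {
        "q": query,
        "orderBy": "modifiedTime",  # oldest changes first
        "pageSize": str(page_size),
        "fields": "nextPageToken, files(id, name, mimeType, modifiedTime)",
    }
    if page_token:
        # Present only when continuing from a previous page's nextPageToken.
        params["pageToken"] = page_token
    return params
```

A caller would send these parameters to `DRIVE_FILES_URL` with an authorised GET request, wait briefly, and then repeat with the returned `nextPageToken` until no token comes back.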

FAQ

How does the Google Drive file sync and transcription workflow work?

This automation monitors a specific Google Drive folder and mirrors its contents into Breyta. Audio and video files are sent to AssemblyAI for transcription, and all files are saved to a private folder under their original filenames.

Which integrations are required for this flow?

The flow uses Google Drive to source your files and AssemblyAI to convert audio and video into text. These tools work together to ensure your media is both stored and indexed for search.

Does this workflow handle audio and video transcription automatically?

When the flow detects new audio or video files, it automatically sends them to AssemblyAI for processing. Once the transcription is finished, the text is reconciled and stored alongside your original file metadata.
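Reconciliation can be pictured as sorting pending jobs by their current status. AssemblyAI reports a transcript's status as `queued`, `processing`, `completed`, or `error`; the sketch below assumes the statuses have already been fetched, and the `reconcile` function and its data shapes are hypothetical rather than taken from the template.

```python
from typing import Dict, List, Tuple


def reconcile(pending: Dict[str, str],
              statuses: Dict[str, str]) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Sort pending jobs into still-pending, completed, and failed.

    pending:  file id -> transcript id
    statuses: transcript id -> status string
    """
    still_pending: Dict[str, str] = {}
    completed: List[str] = []
    failed: List[str] = []
    for file_id, transcript_id in pending.items():
        # A transcript with no known status is kept as pending (assumption).
        status = statuses.get(transcript_id, "queued")
        if status == "completed":
            completed.append(file_id)
        elif status == "error":
            failed.append(file_id)
        else:
            still_pending[file_id] = transcript_id
    return still_pending, completed, failed
```

Only the still-pending entries would be written back to storage, so finished and failed jobs are not checked again on the next run.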

How does the automation keep track of which files have already been synced?

The system uses Breyta's Key-Value (KV) storage to remember the last sync state and keep track of pending transcriptions. This ensures that no files are missed or duplicated, even if you upload large batches of media at once.