Differences

This shows you the differences between two versions of the page.

public:gsoc:2017:saurabh [2017/08/20 23:06]
saurabhshri Add smaller public gif.
public:gsoc:2017:saurabh [2018/01/06 20:03] (current)
willem ↷ Page moved and renamed from public:gsoc:gsoc2017:ccaligner_word_by_word_audio_subtitle_synchronisation_saurabh_shrivastava_gsoc_2017 to public:gsoc:2017:saurabh
Line 3: Line 3:
 // //
  
 +//Blog entry for final submission : (https://saurabhshri.github.io/gsoc-final-submission/) //
 ---- ----
  
Line 31: Line 32:
 In the above example, each word from the subtitle is tagged with beginning and ending timestamps based on the audio.
  
-{{ :public:gsoc:gsoc2017:karaoke-480.gif?nolink |}}
 +{{ youtube>38_27E1PxXA?large }}
 +\\ 
  
 CCAligner uses automatic speech recognition to analyse audio and recognise words to perform alignment. The project comprises both a user-friendly tool and a developer-friendly API.
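
The word-level tagging described above can be illustrated with a small sketch. It assumes the recogniser has already produced a list of words with millisecond timestamps, and it matches each subtitle word against the next few recognised words in order. The names `TimedWord`, `normalise`, `align` and the `window` parameter are hypothetical and chosen for illustration; this is not CCAligner's actual implementation or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedWord:
    text: str
    start_ms: Optional[int] = None  # stays None when no match is found
    end_ms: Optional[int] = None


def normalise(word: str) -> str:
    """Lowercase and drop punctuation so that "Hello," matches "hello"."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def align(subtitle_words: List[str], recognised: List[TimedWord],
          window: int = 3) -> List[TimedWord]:
    """Greedy in-order alignment (illustrative assumption, not the project's method).

    Each subtitle word is searched for among the next `window` recognised
    words; on a match it inherits that word's timestamps.
    """
    aligned = []
    cursor = 0  # index of the first recognised word not yet consumed
    for word in subtitle_words:
        result = TimedWord(word)
        for offset in range(window):
            idx = cursor + offset
            if idx >= len(recognised):
                break
            if normalise(recognised[idx].text) == normalise(word):
                result.start_ms = recognised[idx].start_ms
                result.end_ms = recognised[idx].end_ms
                cursor = idx + 1
                break
        aligned.append(result)
    return aligned


recognised = [
    TimedWord("hello", 0, 400),
    TimedWord("uh", 400, 500),
    TimedWord("world", 500, 900),
]
for item in align(["Hello,", "world!"], recognised):
    print(item.text, item.start_ms, item.end_ms)
```

Words that the recogniser missed keep `None` timestamps in this sketch; a real aligner would need some fallback for them, such as interpolating between neighbouring matches.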
Line 39: Line 41:
   * Project repository on GitHub: https://github.com/saurabhshri/CCAligner
  
-  * Project documentation : https://github.com/saurabhshri/CCAligner/blob/master/README.adoc
 +  * Project readme : https://github.com/saurabhshri/CCAligner/blob/master/README.adoc
 +
 +  * Project documentation : https://github.com/saurabhshri/CCAligner/blob/master/docs/
  
   * My blog (includes weekly GSoC posts) : https://saurabhshri.github.io
Line 55: Line 59:
 ===== Technical Documentation =====
  
-All the technical details are commented in the codes and the documentation is available in the readme of the repository (linked above). Code is properly commented and the variables, classes and other components are named properly in Camel Case for easier understanding of the code. Find compiling, installing and usage instructions here :
 +All the technical details are commented in the codes and the documentation is available in the readme of the repository (linked above). Code is properly commented and the variables, classes and other components are named properly in Camel Case for easier understanding of the code. Find compiling, installing, usage instructions and docs here :
  
   * https://github.com/saurabhshri/CCAligner
Line 67: Line 71:
   * Project repository : https://github.com/saurabhshri/simple-yet-powerful-srt-subtitle-parser-cpp
  
-  * Documentation : Complete documentation is in the readme file located in repository.
 +  * Documentation : https://github.com/saurabhshri/CCAligner/tree/master/docs
  
 2. Improving existing CCExtractor features, fixing issues and helping with PR and code reviews.
Line 78: Line 82:
  
 4. Link to my GitHub profile : https://github.com/saurabhshri
 +
 +
 +===== Some Demonstrations =====
 +
 +{{youtube>6VnhC8u_d40?small}}
 +  * Karaoke Demo 2 [Ted Talk]
 +
 +\\
 +
 +{{youtube>j_zeixo-zJY?small}}
 +  * Karaoke Demo 3 [Cartoon Show]
 +
 +\\
 +
 +{{youtube>8tTDX6NZGsU?small}}
 +  * Karaoke Demo 4 [Discussion Video]
 +
 +\\
 +
 +{{youtube>tFrf0TVnqIQ?small}}
 +  * Transcription Demo [Reality Show]
 +
 +
 +\\   
 +
 +===== Third party libraries and dependencies =====
 +
 +All third-party libraries are located in `src/lib_ext`, along with their individual licenses.
 +
 +1. PocketSphinx : PocketSphinx is a lightweight, portable speech recognition engine. It is used in ASR-based alignment. //(https://github.com/cmusphinx/pocketsphinx)//
 +
 +2. SphinxBase : Basic libraries as well as some common utilities for manipulating acoustic feature and audio files. This is used by PocketSphinx. //(https://github.com/cmusphinx/sphinxbase)//
 +
 +3. srtparser.h : srtparser.h is a simple and powerful single-header C++ srt subtitle parsing library that makes it easy to handle, process and manipulate srt subtitle files. //(https://github.com/saurabhshri/simple-yet-powerful-srt-subtitle-parser-cpp)//
 +
 +4. WebRTC : WebRTC is a free, open project that provides browsers and mobile applications with Real-Time Communications (RTC) capabilities via simple APIs. It is used to perform voice activity detection (VAD) in the project. //(https://webrtc.org)//
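
To make the role of the srt parsing dependency concrete, here is a minimal sketch of what parsing an srt file involves: splitting the file into cues and converting each `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line into milliseconds. It is written in Python for brevity and does not reproduce the API of srtparser.h; `Cue`, `to_ms` and `parse_srt` are hypothetical names used only for illustration.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start_ms: int
    end_ms: int
    text: str


# Matches an srt timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert hour, minute, second and millisecond fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content: str) -> List[Cue]:
    """Parse srt text into cues, skipping blocks that are malformed."""
    cues = []
    normalised = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = TIMING.search(lines[1])
        if not match:
            continue
        g = match.groups()
        cues.append(
            Cue(int(lines[0]), to_ms(*g[:4]), to_ms(*g[4:]), "\n".join(lines[2:]))
        )
    return cues


sample = (
    "1\n00:00:01,000 --> 00:00:02,500\nHello world\n\n"
    "2\n00:01:00,250 --> 00:01:03,000\nSecond line\nwraps\n"
)
for cue in parse_srt(sample):
    print(cue.index, cue.start_ms, cue.end_ms, repr(cue.text))
```

The cue-level start and end times obtained this way bound the audio segment in which the words of that cue are searched for during word-level alignment.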
  
 ===== Known Issues / Future Work Needed =====
Line 83: Line 123:
 The project is in its very early stage and is constantly evolving. The available functions, usage instructions et cetera are expected to change over time. Feel free to contribute and improve the project.
 Currently, only US English is officially supported. For other languages and accents, a properly trained acoustic model can be supplied and experimented with. Text tokenisation within the program needs improvement. Feel free to raise any issue in the repository's issue tracker : https://github.com/saurabhshri/ccaligner/issues
 +
 +===== Read More =====
 +
 +More information and news related to the project can be found at the links above and will be posted from time to time on my blog : https://saurabhshri.github.io
  • public/gsoc/2017/saurabh.1503270395.txt.gz
  • Last modified: 2017/08/20 23:06
  • by saurabhshri