2015-08-27

Egon the rabbit and me

``Do you have a voice?'' I spontaneously asked a rabbit while I was reading an economics book. ``It seems people's productivity is increasing and the world population is saturating, so why does the world still have a lot of problems?''

The rabbit was looking at me. His name is Egon. Today is Stig's first day of school. ``You don't know why?'' Egon said, or maybe a part of me said it as if Egon had told me. I answered, ``No, I don't know. I can say human beings are stupid, but that answer doesn't help me even if it is true.'' Egon kept silent, so I thought it was some auditory hallucination.
Egon

I looked at the cage. ``Is your cage too small for you?'' I asked Egon.

Egon: ``It's small. I would rather have a larger one. How is yours?''

I: ``... Mine? I don't have one.''

Egon: ``Don't you see yours? You made it yourself, and you put yourself in it. As far as I know, only human beings do that.''

I: ``Do you mean jail? Usually no one wants to put themselves in it.''

Egon: ``No. Don't you build a cage called a city?''

I: ``... How can a city be a cage?''

Egon: ``Before you built a city, everyone could sit anywhere. Now you can sit freely only in your own small part. Some people don't even have their own place. In most of the city you cannot sit freely. Maybe you don't see the physical cage, but what is the difference?''

I: ``... I see, interesting. I haven't thought about the city in that way.''

Egon: ``Not only the physical ones. You build a system called capitalism or whatever. Then you put yourself in it. Even if you think it will collapse or is dangerous, people like to put themselves in it. You are even afraid to lose your cage.''

I: ``...''

Egon and me

Egon stopped talking here. I started to realize that I am actually in a cage. No, I am in multiple cages. A city, a political system, an energy system, the language I can use, ... I thought these were for a better life, but Egon told me all of them can be a limitation... a cage. I thought human beings were the third cleverest animal on the earth, but now I am thinking we are behind the rabbit, so fourth at best.

Egon: ``You might overrate your life too much. You could find some balance.''

I: ``I overrated it too much. We made nations, armies, political systems, economic systems, energy systems, etc. to protect us, but actually they could also destroy us. Do you think this is our problem?''

Egon: ``Maybe so, maybe not.''

The sun is going down. Stig took Egon out of his cage and put him into his sleeping cage. ``Egon, do you have a voice, like an audible voice?'' I asked again. ``Is it really important? Be careful about overrating something,'' he answered.

He was busy eating his vegetables in the cage. ``Did he mean this is another kind of cage? That I always want to hear some audible voice? That it is actually not so important?'' I was not sure at that time, but I think I saw Egon smiling.

2015-08-10

Semi-automated timing generation method for video subtitles

Abstract

I work as a volunteer, for free, on translating mathematics material for everyone. My workflow has three main tasks: 1. translating the script in an srt file, 2. dubbing the video, 3. generating the subtitles. I found that generating the subtitle timing is a time-consuming task, so I want to reduce it. When I generate the subtitles, I already have the translated script and the sound of the video, so I try to use these data to semi-automate the subtitle timing generation. This time I use YouTube's transcript function to generate the subtitle timing, which reduces the time this task takes. I implemented a script that converts an srt file to a text file, since YouTube's transcript function requires text format data. YouTube's transcript function not only generates the timing, it also edits the lines (it inserts some newlines). Therefore, I also implemented a script that concatenates subtitle lines. In one experiment, fully manual work took 4.5 hours to generate the subtitle timing for a 13-minute video. With this method, it took around 3 hours for a video of similar length. These scripts are published under the new BSD license, so anyone can use them freely.

Semi-automated timing generation method for video subtitles

I work as a volunteer, for free, on translating mathematics material for everyone. These videos explain ``Why does dividing by a fraction mean turning the fraction upside down and multiplying by it?'' or ``In the first place, what is the meaning of division by a fraction? I know what dividing by a whole number means, but what is the meaning of dividing by 2/3?'' I also make subtitles for these videos, but generating the subtitle timing took a lot of time. For example, it once took me four and a half hours to generate the subtitle timing for a 13-minute video.

However, when I generate the subtitle timing, I already have the translated script and the voice recording. I try to use these data as much as possible to generate the subtitle timing. This is the SubdayResearch theme this time.

One of my friends wrote software that analyzes an mp3 file with an FFT and extracts the rhythm from it. He had an input device for the game ``Dance Dance Revolution'' but did not have the game software, so he wrote his own game software to use the device. I first thought I needed to analyze the video file to generate the timing, so I discussed it with him. However, he suggested that I first search for existing software; maybe I could find some free software that does this. In my case, I need Japanese voice analysis.

I searched for subtitle generation software and found some, including YouTube's own functionality. One piece of software generates subtitles in many languages. I read its documentation and found that it first generates subtitles in the video's language with YouTube's automatic subtitle generation function, then uses Google Translate to generate the subtitles in the other languages.

As an experiment, I tried YouTube's automatic subtitle generation function, but the result for my voice was not precise enough. The precision of the timing seems fine, but the text quality is not. However, I only need the timing information, since I already have the translated script. If I could map only the timing of the automatically generated subtitles onto the manually translated script, it would work. So, we had the following ideas:
  • Can we search for corresponding strings between the automatically generated subtitles and the manually translated script, allowing for some amount of error in the strings?
  • If we have corresponding points, can we minimize the distance to compensate for the errors? This could be an optimization problem with respect to the string distance.
However, when I checked some automatically generated subtitles, the strings had too many errors, and it seemed difficult to use this idea.
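
To make the first idea concrete, here is a minimal sketch of approximate string matching using Python's standard difflib module. It is only an illustration of the idea, not something I used in the workflow: the function name best_match, the cue dictionaries, the threshold of 0.6, and the English sample strings are all assumptions made for this example. As noted above, with real automatically generated subtitles the similarity scores would probably be too low to be useful.

```python
from difflib import SequenceMatcher


def best_match(script_line, auto_cues, threshold=0.6):
    """Find the auto-generated cue whose text is most similar to script_line.

    auto_cues is a list of dicts with "start", "end", and "text" keys
    (a hypothetical layout chosen for this sketch).
    Returns (cue, score), or None if no cue reaches the threshold.
    """
    best_cue = None
    best_score = 0.0
    for cue in auto_cues:
        # ratio() is 1.0 for identical strings and 0.0 for no overlap.
        score = SequenceMatcher(None, script_line, cue["text"]).ratio()
        if score > best_score:
            best_cue, best_score = cue, score
    if best_cue is None or best_score < threshold:
        return None
    return best_cue, best_score


if __name__ == "__main__":
    # Hypothetical auto-generated cues containing recognition errors.
    cues = [
        {"start": 0.0, "end": 2.5, "text": "too times tree is six"},
        {"start": 2.5, "end": 5.0, "text": "for times five is twenty"},
    ]
    match = best_match("two times three is six", cues)
    if match is not None:
        cue, score = match
        print(cue["start"], cue["end"], round(score, 2))
```

The second idea (minimizing the total string distance over all lines) would build on the same similarity score, but this sketch only picks the best cue for one line at a time.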

I assume voice analysis is a difficult task, so I try to avoid it. I would like to solve my problem with as little effort as possible, though I will put in some effort if it is really needed.

I continued the discussion a few times during our lunch breaks (SubdayResearch discussions usually happen at a lunch break or at a party), and we realized that my real problem is not subtitle generation but subtitle timing generation. So I searched again with ``subtitle timing generation,'' not ``subtitle generation.'' Then I found that YouTube has a transcript function. This function generates subtitles from a video and the text of its contents. Currently, YouTube supports timing generation for 10 languages.

The input of the transcript function is a text plus a little extra. I manually generated a text file from an srt file and tried this function. The result is precise enough to use. However, the text is sometimes cut, maybe because the function tries to fit the text within some length. I need to remove these cuts for further processing. At the end, I fine-tune the result with Amara or Camtasia (both are software that can adjust subtitle timing manually).
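
The manual step of making a text file from an srt file can be done by a small filter. The following is a minimal sketch under stated assumptions, not the published script: the function name srt_to_text is hypothetical, and it assumes the usual srt layout in which each cue is a counter line, a timing line containing ``-->'', and one or more text lines, with cues separated by blank lines.

```python
import re


def srt_to_text(srt):
    """Return only the subtitle text of an srt string, one text line per line."""
    # Normalize Windows line endings, then split into cues at blank lines.
    blocks = re.split(r"\n\s*\n", srt.replace("\r\n", "\n").strip())
    text_lines = []
    for block in blocks:
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                # Everything after the timing line is subtitle text.
                text_lines.extend(t.strip() for t in lines[i + 1:] if t.strip())
                break
    return "\n".join(text_lines)


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:02,000\nHello\nworld\n\n"
        "2\n00:00:02,000 --> 00:00:04,000\nBye\n"
    )
    print(srt_to_text(sample))
```

The counter and timing lines are dropped on purpose, since the transcript function generates new timing from the text alone.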

In the end, I need two simple filter scripts for my workflow:
  • A filter that converts an srt file to text format
  • A filter that removes the subtitle newlines in an srt file
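
As an illustration of the second filter, here is a minimal sketch that joins the text lines of each cue into a single line while keeping the counter and timing lines. It is an assumption about what such a filter could look like, not the published script: the name join_cue_lines and the sep parameter are hypothetical, and it assumes the newlines to remove are those inside one cue.

```python
import re


def join_cue_lines(srt, sep=" "):
    """Join the text lines of every cue in an srt string into one line.

    sep is placed between the joined lines; for Japanese text,
    passing sep="" is probably more suitable.
    """
    blocks = re.split(r"\n\s*\n", srt.replace("\r\n", "\n").strip())
    out = []
    for block in blocks:
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                header = lines[: i + 1]  # counter line and timing line
                text = sep.join(t.strip() for t in lines[i + 1:] if t.strip())
                out.append("\n".join(header + [text]))
                break
        else:
            # No timing line found: keep the block unchanged.
            out.append(block)
    return "\n\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:02,000\nHello\nworld\n\n"
        "2\n00:00:02,000 --> 00:00:04,000\nBye\n"
    )
    print(join_cue_lines(sample), end="")
```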

Implementation

These filters are published at the following URL:

The license is the new BSD license, so everyone can use them freely.

Experimental result and conclusion


I made two videos about the multiplication table, each around 13 minutes long. I generated the subtitles fully manually for the first video, and it took 4 hours 27 minutes. I used this method for the second video, and it took 2 hours 37 minutes.

I think more than one and a half hours is a good time reduction for this video.

Future work


I would like to try this method for further video creations.

I am also looking for any other (simple/easy) methods to reduce video creation
time.

Acknowledgments

Thanks to Dietger, Daniel, and Jörg for discussions and ideas.