Value driven web development

As you might know, Ankoder.net has been out for a while now. It lets anyone download videos and convert them for the iPod and various other formats. Scraping the FLV URL from YouTube's HTML isn't exactly easy.

This is how we do it:

  require 'open-uri' # provides open() for URLs (use URI.open on Ruby 2.5 and later)

  # Returns the FLV download URL for a YouTube watch URL, or nil if none is found.
  def parse_youtube(url)
     youtube = "http://www.youtube.com/"
     # url =~ /(?:http\:\/\/.*youtube.com\/(?:watch\?v=|v\/))?(.*)$/
     return nil unless url =~ /watch\?v=([^&]+)/
     video_id = $1 # everything after v= up to the next "&"
     flv_url = nil
     open("#{youtube}watch?v=#{video_id}") do |f|
       f.each_line do |line|
         if line =~ /watch_fullscreen\?(.*?)video_id=([\w-]+)&(.*?)&t=([\w-]+)&/
           # p line
           flv_url = "#{youtube}get_video?video_id=#{$2}&t=#{$4};auto"
           break
         end
       end
     end
     flv_url
  end

# Kept as one logical string so no newline ends up inside the curl command.
USER_AGENT = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.11) " \
             "Gecko/20071231 Firefox/2.0.0.11 Flock/1.0.5"

IO.popen("curl -o \"#{file_name}\" -L -A \"#{USER_AGENT}\" \"#{parse_youtube(url)}\"  2>&1")
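
The ID extraction and the shell command are the two fragile spots in the snippet above. Here is a minimal sketch of a stricter version that uses only Ruby's standard library. `extract_video_id` and `curl_command` are hypothetical helper names, not part of Ankoder, and the example URLs are made up; the sketch does not depend on YouTube's page format.

```ruby
require 'uri'
require 'shellwords'

# Hypothetical helper: pull the "v" parameter out of a watch URL.
# Returns nil when the URL has no usable "v" parameter.
def extract_video_id(url)
  query = URI.parse(url).query
  return nil if query.nil?
  id = URI.decode_www_form(query).to_h['v']
  id if id && id =~ /\A[\w-]+\z/
end

# Hypothetical helper: build the curl invocation with every argument
# shell-escaped, so spaces or quotes in a file name cannot break the command.
def curl_command(file_name, user_agent, source_url)
  ['curl', '-o', file_name, '-L', '-A', user_agent, source_url].shelljoin
end

puts extract_video_id('http://www.youtube.com/watch?v=abc-123&feature=related')
puts curl_command('my video.flv', 'Mozilla/5.0', 'http://example.com/get_video?video_id=abc-123')
```

The escaped string can be handed to IO.popen in place of the hand-quoted command line.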

As you know, Rails handles file uploads poorly: a large file ties up your Rails process for a long time, so it can't respond to other visitors while it is busy receiving the file. They get upset and leave.

One solution is to let Merb handle the file uploads for Rails. The latest Merb, which is built on Rack (a cool framework that deals with all kinds of HTTP servers for you), does a really good job with uploads.

First, install merb:


sudo gem install merb

Second, create a merb app in your rails dir:

    merb-gen app uploader
    cd uploader

You can ignore every file except config/rack.rb; it is the only one we need to modify. Currently the file contains a single line:


    run Merb::Rack::Application.new

This tells Merb to handle the HTTP requests coming from Rack. Let's change the file to:

require 'cgi'

class File
  def to_s
    path
  end
end

# build a new handler to handle Rack's requests
  class Uploader 
    def call(env)
      # leverage merb's utility to parse the request.
      # Merb will save the file to a tempfile and save the tempfile's path in request's param
      request = Merb::Request.new(env)

      params = request.params

      # pass the params directly to the real (rails) app
      result = post("http://someplace.com/api", hash_to_params(params)).split("\n")[-1]
     
      # processing result or just ignore it ...
    end

    private

    def post(url, params="")
      curl_cmd = "curl -H \"Content-type: application/x-www-form-urlencoded\" #{url} -d \"#{params}\""
      puts "curl_cmd = #{curl_cmd}"
      f = IO.popen(curl_cmd +" 2>&1")
      result = f.read
      f.close
      result
    end

    def hash_to_params(hash)
      hash.map do |k, v|
        if v.kind_of? Hash
          h = {}
          v.each { |kk, vv| h["#{k}[#{kk}]"] = vv }
          hash_to_params h
        else
          "#{k}=#{CGI.escape v.to_s}"
        end
      end.join("&")
    end

  end

run Uploader.new

Start Merb with:

    merb -p 1234 -c 1 -e production -d

Remember to configure Apache (or your favorite web server) to forward all requests for /uploader to port 1234, where your Merb uploader is listening.
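
To make the forwarding step concrete, here is a standalone sketch of how a nested params hash flattens into a form-encoded body. It follows the same idea as the `hash_to_params` helper in config/rack.rb but is not that code: `flatten_params` is a hypothetical name, the sample params are invented, and `URI.encode_www_form_component` stands in for `CGI.escape`.

```ruby
require 'uri'

# Flatten a nested params hash into "a[b]=c" pairs before posting it on.
# Values are escaped; keys are left as-is, as in the uploader's helper.
def flatten_params(hash, prefix = nil)
  hash.flat_map do |key, value|
    name = prefix ? "#{prefix}[#{key}]" : key.to_s
    if value.is_a?(Hash)
      flatten_params(value, name) # recurse, carrying the bracketed prefix
    else
      ["#{name}=#{URI.encode_www_form_component(value.to_s)}"]
    end
  end
end

# Invented sample: a tempfile path plus two ordinary fields.
params = {
  'upload'  => { 'file' => '/tmp/merb upload.flv', 'title' => 'Demo' },
  'user_id' => 42
}
puts flatten_params(params).join('&')
# prints "upload[file]=%2Ftmp%2Fmerb+upload.flv&upload[title]=Demo&user_id=42"
```

Because only the tempfile path travels in the body, the Rails app receives a small request no matter how large the uploaded file was.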

Pretty easy, isn’t it?

We've extended Autocompleter.Local from Scriptaculous to implement an autocompleting "To:" field that mimics Facebook's. This was a little challenging at the start, but prototype.js and Scriptaculous made it so much easier.


Concepts

  • JSON: an array of contacts with names, email addresses and any other fields you wish to search
  • The input box resizes dynamically and repositions itself as you type
  • Each 'token' created is an input field that submits the user's id or an email address

Straight to the source

Syntax


new Autocompleter.LocalAdvanced(id_of_text_field, id_of_div_to_populate, json_array, options)

The constructor takes four parameters. The first two are, as usual, the id of the monitored textbox and the id of the autocompletion menu. The third is the JSON array of objects that you want to autocomplete from, and the fourth is the options block.

Extra local autocompletion options

| Option | Default Value | Description |
|---|---|---|
| search_field | "name" | Which attribute to search in the json array. |
| choices | 10 | How many autocompletion choices to offer |
| partialSearch | off | If false, the autocompleter will match entered text only at the beginning of strings in the autocomplete array. Defaults to true, which will match text at the beginning of any word in the strings in the autocomplete array. If you want to search anywhere in the string, additionally set the option fullSearch to true |
| fullSearch | false | Search anywhere in autocomplete array strings. |
| partialChars | 2 | How many characters to enter before triggering a partial match (unlike minChars, which defines how many characters are required to do any match at all). |
| ignoreCase | true | Whether to ignore case when autocompleting |
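
The matching rules in the table are easier to follow with a small model. The following Ruby sketch illustrates the rules only; it is not the JavaScript shipped in the zip. `match_kind` is a hypothetical function, its keyword names simply mirror the option names, and the defaults it uses are assumptions chosen for the demo.

```ruby
# Classify how `entry` (the typed text) matches `text` (one candidate string).
# Returns :prefix, :partial, or nil when there is no match.
def match_kind(text, entry, partial_search: true, full_search: false,
               partial_chars: 2, ignore_case: true)
  return nil if entry.empty?
  if ignore_case
    text = text.downcase
    entry = entry.downcase
  end
  # A match at the very start of the string always counts.
  return :prefix if text.start_with?(entry)
  # Partial matches need the option switched on and enough typed characters.
  return nil unless partial_search && entry.length >= partial_chars

  found = if full_search
            text.include?(entry) # anywhere in the string
          else
            text.split(/\s+/).any? { |word| word.start_with?(entry) } # start of any word
          end
  found ? :partial : nil
end

names = ['rex chung', 'chung rex', 'chris chan']
p names.map { |n| match_kind(n, 'chu') }                    # [:partial, :prefix, nil]
p names.map { |n| match_kind(n, 'hun', full_search: true) } # [:partial, :partial, nil]
```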

Example

HTML

 <div tabindex="-1" id="ids" class="clearfix tokenizer" onclick="$('autocomplete_input').focus()">
<span class="tokenizer_stretcher">^_^</span><span class="tab_stop"><input type="text" id="hidden_input" tabindex="-1"></span>
 <div id="autocomplete_display" class="tokenizer_input"> 
 <input type="text" size="1" tabindex="" id="autocomplete_input"   />
 </div>                                                                          
 </div>

<div id="autocomplete_populate" class="clearfix autocomplete typeahead_list" style="width: 358px; height: auto; overflow-y: hidden;display:none">
 <div class="typeahead_message">Type the name of a friend, friend list, or email address</div>                       
 </div>  

Javascript

    (new Image()).src='/inbox/images/token.gif';
    (new Image()).src='/inbox/images/token_selected.gif';
    (new Image()).src='/inbox/images/token_hover.gif';
    (new Image()).src='/inbox/images/token_x.gif';

    var contacts = [
        {name:"phoenix zhuang",email:"phoenix@rorcraft.com"},
        {name:"jian xie",email:"jan.xie@rorcraft.com"},
        {name:"isaiah peng",email:"isaiah.peng@rorcraft.com"},
        {name:"chris chan",email:"chris.chan@rorcraft.com"},
        {name:"rex chung",email:"rex@rorcraft.com"},
        {name:"chung rex",email:"chung@rorcraft.com"},
        {name:"chan chris",email:"chan@rorcraft.com"},
        {name:"peng isaiah",email:"peng@rorcraft.com"} ];

    var typeahead = new Autocompleter.LocalAdvanced('autocomplete_input', 'autocomplete_populate', contacts, {
        frequency: 0.1,
        updateElement: addContactToList,
        search_field: "name"
    });
    var hidden_input = new HiddenInput('hidden_input',typeahead);

CSS

/* autocompleter.advancedlocal css */
.tokenizer{min-height:5px;padding:0px 0px 3px 3px;width:100%;background:#fff;font-size:11px;}
.tokenizer_locked{background:#f4f4f4;}
.tokenizer,
.tokenizer *{cursor:text}
.tokenizer input{width:100%;}
.tokenizer .tokenizer_input,
.tokenizer .token{float:left;margin-right:3px;margin-top:3px;}
.tokenizer .tab_stop,
.tokenizer .tokenizer_stretcher{display:block;float:left;overflow:hidden;width:0px;}
.tokenizer .tab_stop{height:0px;}
.tokenizer .tokenizer_stretcher{padding-top:7px;}
#autocomplete_input{width:20px;}
#facebook .tokenizer .tab_stop input{border:0px solid black;display:inline;position:relative;left:-500px;}
#facebook .tokenizer .tokenizer_input_borderless {left:4px;margin-left:-1px;overflow:hidden;position:relative;}
#facebook .tokenizer_input_borderless  #autocomplete_input{border:3px solid white!important;border-left:none;display:block;margin:-3px 3px -4px -2px;padding:0px!important;}
/*IE6-/Win only*/
/*\*/ * html#facebook .tokenizer_input_borderless #autocomplete_input { border:3px solid black;margin: -3px 3px -4px 14px;padding-left:10px; } /**/
.tokenizer div:-moz-first-node{padding-top:1px!important;}
.tokenizer_input{max-width:450px;overflow:hidden;padding:1px 0px;}
#facebook .tokenizer_input input,
.tokenizer_input_shadow{border:0px solid black;outline:0;font-family:'lucida grande', tahoma, verdana, arial, sans-serif;font-size:11px;padding:0px 5px;margin:0 0 -1px 0;white-space:pre;}
.tokenizer_input_shadow{display:inline;left:-10000px;position:absolute;top:-10000px;}
.tokenizer .tokenizer_input_shadow{height:0px;display:block;left:0px;overflow:hidden;position:relative;top:0px;}
div.tokenizer .token{background-image:url('/inbox/images/token.gif');background-repeat:no-repeat;color:black;white-space:nowrap;}
div.tokenizer .token span{background-image:url('/inbox/images/token.gif');background-position:top right;background-repeat:no-repeat;display:block;}
div.tokenizer .token span span{background-position:bottom right;}
div.tokenizer .token span span span{background-position:bottom left;}
div.tokenizer .token span span span span{background-image:none;padding:2px 3px 2px 5px;}
div.tokenizer.tokenizer_locked .token span span span span{padding-right:5px;}
html div.tokenizer_locked .token:hover,
html div.tokenizer_locked .token:hover span{background-image:url('/inbox/images/token.gif');}
div.tokenizer .token:hover,
div.tokenizer .token:hover span{background-image:url('/inbox/images/token_hover.gif');text-decoration:none;}
div.tokenizer .token_selected,
div.tokenizer .token_selected span,
div.tokenizer .token_selected:hover,
div.tokenizer .token_selected:hover span{background-image:url('/inbox/images/token_selected.gif');color:white;text-decoration:none;}
div.tokenizer .token span.x,
div.tokenizer .token span.x_hover,
div.tokenizer .token:hover span.x,
div.tokenizer .token:hover span.x_hover{background-image:url('/inbox/images/token_x.gif');background-position:4px 2px;cursor:pointer;display:inline;padding:0px 6px 0px 5px;}
div.tokenizer.tokenizer_locked .token span.x,
div.tokenizer.tokenizer_locked .token span.x_hover{display:none;}
div.autocomplete {  position:absolute;  width:355px;  background-color:white;  border:1px solid #888;  margin-top:-2px;  padding:0px;}
div.autocomplete ul {  list-style-type:none;margin:0px; padding:0px;}
div.autocomplete ul li.selected { background-color: #ffb;}
div.autocomplete ul li {  list-style-type:none;  display:block;  margin:0;  padding:2px;  height:32px;cursor:pointer;}
/* end of autocompleter.advancedlocal css */

Download the full source

Autocomplete_AdvancedLocal.zip

[update] We've forked controls.js and changed all references to element.style to use setStyle(), because the direct style access was causing script errors in IE.
http://github.com/rorcraft/scriptaculous/tree/master/src/controls.js

Paperclip has some good features over attachment_fu: "attached files don't need to have a separate model (thank god). Your attachments are treated just like any other attribute. Images aren't saved until your model is saved" (by Jim Neath).

The drawback of Paperclip is that it tries to create a thumbnail for every type of file, including PDFs. This doesn't cause much trouble when it simply fails to create a thumbnail for a file. But with a PDF, Paperclip tries to generate a thumbnail for every page, so uploading a PDF with hundreds of pages becomes very slow. Sometimes it even times out! attachment_fu does not have this problem.

attachment_fu.rb includes a class method #image? that tells whether the file is an image.

      # Returns true or false if the given content type is recognized as an image.
      def image?(content_type)
        content_types.include?(content_type)
      end

The image content types are initialized at the beginning of the file:


      @@content_types = ['image/jpeg', 'image/pjpeg', 'image/gif', 'image/png', 'image/x-png', 'image/jpg']

The #image? method is then called from the instance method #thumbnailable?:

      # Checks whether the attachment's content type is an image content type
      def image?
        self.class.image?(content_type)
      end

      # Returns true/false if an attachment is thumbnailable.
      # A thumbnailable attachment has an image content type and the parent_id attribute.
      def thumbnailable?
        image? && respond_to?(:parent_id) && parent_id.nil?
      end

#create_or_update_thumbnail then checks thumbnailable? to decide whether to go ahead:

      # Creates or updates the thumbnail for the current attachment.
      def create_or_update_thumbnail(temp_file, file_name_suffix, *size)
        thumbnailable? || raise(ThumbnailError.new("Can't create a thumbnail if the content type is not an image or there is no parent_id column"))
        returning find_or_initialize_thumbnail(file_name_suffix) do |thumb|
          thumb.attributes = {
            :content_type             => content_type,
            :filename                 => thumbnail_name_for(file_name_suffix),
            :temp_path                => temp_file,
            :thumbnail_resize_options => size
          }
          callback_with_args :before_thumbnail_saved, thumb
          thumb.save!
        end
      end

Paperclip has a similar structure, which makes this patch easier to write.

From line #217
    def post_process #:nodoc:
      return if @queued_for_write[:original].nil?
      logger.info("[paperclip] Post-processing #{name}")
      @styles.each do |name, args|
        begin
          dimensions, format = args
          dimensions = dimensions.call(instance) if dimensions.respond_to? :call
          @queued_for_write[name] = Thumbnail.make(@queued_for_write[:original],
                                                   dimensions,
                                                   format,
                                                   @whiny_thumnails)
        rescue PaperclipError => e
          @errors << e.message if @whiny_thumbnails
        end
      end
    end

The trick is to call #thumbnailable? right after the begin keyword and raise a PaperclipError if it returns false:

    def post_process #:nodoc:
      return if @queued_for_write[:original].nil?
      @styles.each do |name, args|
        begin
          # Test here
          thumbnailable? || raise(PaperclipError.new("Can not create thumbnails if the content type is not an image."))
          dimensions, format = args
          dimensions = dimensions.call(instance) if dimensions.respond_to? :call
          @queued_for_write[name] = Thumbnail.make(@queued_for_write[:original],
                                                   dimensions,
                                                   format,
                                                   @whiny_thumnails)
        rescue PaperclipError => e
          @errors << e.message if @whiny_thumbnails
        end
      end
    end

Add the #image? and #thumbnailable? methods to Paperclip's attachment.rb, and initialize the thumbnailable content types at the beginning of the file. That's all; now you can upload PDF files very quickly.
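
The post does not show the two added methods, so here is one way they might look, as a self-contained sketch. `SketchAttachment` is a stand-in class, not Paperclip's real Attachment; the content-type list is copied from attachment_fu, and the "resize" is a placeholder string so the example runs without ImageMagick.

```ruby
# Stand-in for an attachment; NOT Paperclip's real Attachment class.
class SketchAttachment
  # Same list attachment_fu uses for image content types.
  IMAGE_CONTENT_TYPES = ['image/jpeg', 'image/pjpeg', 'image/gif',
                         'image/png', 'image/x-png', 'image/jpg'].freeze

  class ThumbnailError < StandardError; end

  attr_reader :content_type, :errors, :thumbnails

  def initialize(content_type, styles, whiny: true)
    @content_type = content_type
    @styles = styles
    @whiny = whiny
    @errors = []
    @thumbnails = {}
  end

  # True when the content type is one of the known image types.
  def image?
    IMAGE_CONTENT_TYPES.include?(content_type)
  end

  # Without a parent_id column to check, thumbnailable? reduces to image?.
  alias thumbnailable? image?

  def post_process
    @styles.each do |name, dimensions|
      begin
        thumbnailable? || raise(ThumbnailError, 'Can not create thumbnails if the content type is not an image.')
        @thumbnails[name] = "#{name} @ #{dimensions}" # placeholder for the real resize
      rescue ThumbnailError => e
        @errors << e.message if @whiny
      end
    end
  end
end

pdf = SketchAttachment.new('application/pdf', { thumb: '100x100' })
pdf.post_process
p pdf.thumbnails # {} because the guard fired before any resize was attempted

png = SketchAttachment.new('image/png', { thumb: '100x100' })
png.post_process
p png.thumbnails.keys # [:thumb]
```

The guard fails before any conversion starts, which is why a multi-page PDF no longer slows the upload down.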

For how to use Paperclip, Jim Neath has a great tutorial, Paperclip: Attaching Files in Rails. Enjoy it!