Awesome-like Window Switching

Motivation: Unlike Windows and macOS, Linux offers a highly customizable desktop environment. Within it, the window manager plays a crucial role in controlling the placement and appearance of windows and the interaction when you switch, preview, or hide them. These interactions quietly steal your time, especially when you work with many windows across multiple screens and rely heavily on the mouse. Developers often prefer to work this way. Fortunately, some excellent window managers have been developed to alleviate this issue, and Awesome is a representative one. However, naively using Awesome on a laptop disables my favorite touchpad gesture tool, libinput-gestures. Thus,...…


Review for Image Captioning

Table of Contents: Image Captioning; Conventional Approaches; Deep Learning Approaches; m-RNN; LRCN; Show and Tell (NIC); Show, Attend and Tell. Image Captioning: Image captioning aims to describe an image using natural language. It’s quite a challenging task in computer vision because, to automatically generate a reasonable caption, your model has to capture global and local features and recognize objects along with their relationships, attributes, activities, etc. In addition, the language model takes a lot of work to generate grammatically correct sentences. Image captioning is useful for precise image retrieval, and might help blind people sense the world as well. For example, Facebook uses...…


Object Detection Review

Table of Contents: The Problem Description; Conventional Approaches; Face Detection; Pedestrian Detection; HOG Features; Deformable Part Model (DPM); Deep Learning Approaches; R-CNN; SPP-net; Fast R-CNN; Faster R-CNN. The Problem Description: Object detection is the process of detecting instances of semantic objects of a certain class in digital images and videos. The object recognition task reduces to an object detection task if we know what we are looking for. If we apply a recognition algorithm to every possible sub-window in a given image, it’s likely to be both slow and error-prone. A more effective approach is to construct a specified detector that...…


How to Learn Computer Vision Well

Kick off the Game! As a master’s student with a machine learning background, it’s common to think about new domains like computer vision (CV), natural language processing (NLP), and speech recognition. These three domains are treated as the foundation of general AI. Giant companies like Google, Facebook and Microsoft invest heavily in them. Because I have always dreamed of building my own intelligent robot, and CV plays a crucial role in robotics, I began my computer vision research journey without hesitation. What is Computer Vision? According to Wikipedia, computer vision is an interdisciplinary field that deals with how...…


Arch Linux Infinite Boot Loop

Infinite Boot Loop after Upgrade: I have been using Arch Linux since March 2016, when my Ubuntu installation crashed again. Although Ubuntu is one of the standard working environments for many open source projects, it sometimes crashes unpredictably even after operations you believe to be normal and correct. Most importantly, you hesitate over whether to upgrade the system version, and you may have to do a lot of work just to try new software or library features. After some investigation, I decided to use Arch Linux, a full rolling-release Linux distribution. The installation procedure was painful, but it was worth it, and everything is under my control....…


kNN Classify Handwritten Digits

kNN Intuition: As a common nonparametric learning algorithm, kNN rests on a pretty simple intuition. For every unclassified test point, find the k nearest neighbors in the training dataset. Then predict the class of the test point according to the classes of these k nearest neighbors. In summary, it’s a kind of geometric intuition for prediction. kNN Distance Metric: The most widely used distance metric is the $L_p$ distance. The distance between $x_i$ and $x_j$ is $$L_p(x_i, x_j) = \left( \sum_{l=1}^{n} \left| x_i^{(l)} - x_j^{(l)} \right|^p \right)^{1/p}$$ in which $x_i$ and $x_j$ both have $n$ dimensions and $x_i^{(l)}$ denotes the $l$-th component of $x_i$. Specifically, when $p = 2$, it’s Euclidean distance; when $p = 1$, it’s Manhattan distance. Because...…
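The intuition and distance metric described in this excerpt can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the full post: the function names `lp_distance` and `knn_predict`, the default `k=3`, and the majority-vote rule for combining neighbor classes are all assumptions made here.

```python
from collections import Counter


def lp_distance(a, b, p=2):
    """L_p distance between two equal-length points (p=2 Euclidean, p=1 Manhattan)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_points, train_labels, query, k=3, p=2):
    """Predict the label of `query` by majority vote among its k nearest training points.

    The majority vote is an assumed choice for this sketch; the excerpt only says
    the prediction is made "according to the classes" of the neighbors.
    """
    # Rank training indices by their distance to the query point.
    ranked = sorted(
        range(len(train_points)),
        key=lambda i: lp_distance(train_points[i], query, p),
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]
```

For example, with three training points labeled "a" near the origin and three labeled "b" near (5, 5), `knn_predict` assigns the query (0.5, 0.5) to "a". A brute-force scan like this costs one distance computation per training point for every query, which is fine for small datasets.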
