Yet another bootstrap theme.

2015-05-30

Largest rectangle in instagram 解题报告

题目

题目链接

思考

方法1: O(n^2)的遍历做法.

很简单, 一个个rectangle area计算出来, 然后枚举/排序/找最大.
这里outer loop来抓每一个area的height, 然后分别从它向左/右找第一个比他小的数. 然后计算面积. 在计算的过程中update ans.
这里向左向右的while loop找lhs/rhs可以追述到Algs4的Binary search, Quick Sort的partition. 或者是Maximum subarray的Divide & Conquer的做法.
这里很重要的一点: 还是: 数列问题的index. 这里的终止条件是i = -1 || i = length. 这样的话, 可以统一起来lhs总是比他小的那个位置, rhs总是比他大的那个位置. 从而width = rhs-lhs-1.

方法2: Stack的使用: O(1)找左右第一个比他小的数

Stack又一个用法: 一个严格单调递增的stack. 它的左边第一个就是就是比它小的第一个数.
九章课上说, 可以有2个方法:
1. 正着跑一边incStk, 找到他的左边第一个比他小的位置, 然后reverse array, 再跑一遍stack, 找到右边第一个比他小的位置. (这和方法1有什么区别? 有的, 这个是O(n)找到所有点的左右范围.)
2. 也可以是九章答案的做法, 就是for里面一个while, 来维持单调递增性, 若违反了, 则说明找到了右边界, 因为其左侧第一个就是第一个比他小的数的位置.
但是: 虽然意思懂了, 但是一做发现很多错误! 我第一次在Lintcode上面写就出现了一个很好的错误: 如果[1,1]的话怎么弄? 应该输出2, 但我得到1! 这是因为我的width计算不对.
- 有个很重要的点: 例如[1], 然后cur = 0. 那么右边界找到了, 接着pop, 于是incStk = []. 那么左边界是多少呢? 我当时直接i-l-1. 其中l是incStk.peek(). 其实这样不对. 也导致了我的area是1. 因为stack为空, 说明height[pop()]的左边界是通到i = 0! 即width = i.
- 上面的这个错误的原因还有一个! 就是我在while loop判断的时候用的是”<”, 也就是维护的是一个单调不递减stack, 而不是单调递增stack. 这样的后果是, incStk出现了[1,1], 然后cur = 0. 接着pop位于第二个的1. 算出来的不知道是有什么物理意义. 所以正确的做法是while loop的判断用的是<=. 从而保证单调递增, 即左边就是比它小的数.

方法3: Dynamic programming: 凡是找极值都应该想想动态规划.

Elementary

1. exhaustive search

2. Greedy

3. DP I

4. Data Structure

5. Graph

6. Math trick

*. Problems

Intermediate

1. Binary Search

2. Tricks I

3. Data Structure

4. DP II

5. Max flow/Min cut

6. jihe

*. Problems

High

1. Math Problems

2. Game Strategies

3. Graph Theory

4. Tricks II

5. Clever search

6. Divide and Conquer

7. Elegant String Algs

*. problems

题目

题目链接

思考

first trial: fail

这里我还是对DP的状态没搞清楚. 做DP之前, 乃至所有coding之前要想好再动手!
对Substring不熟. 对string不熟. s.substring(a,b+1)表示的是string[a…b]. 所以s.substring(0, s.length()+1)才是返回整个string的.
String的DP表的边界往往是DP[s.length()+1]. 因为DP[0] = true: 表示: 选择string的0个元素也是符合条件的. 然后结果就是返回DP[s.length()]表示选择s的所有元素.
这里的optimally equation搞了很久. 其实公式是知道的, 但是一直不对. 其实就是: DP[i] = DP[j], j < i && s.substring(j,i+1)在dict内.

代码如下: 第一次写的时候, i,k这样写的话感觉不是很通用. 是在for loop statement的地方写好range, 还是在loop里面+1? 这里因为我的i的范围是1~s.length()

  public static boolean wordBreak1st(String s, Set<String> dict) {
    if (s == null || s.length() == 0) {
      return true;
    }
    boolean[] F = new boolean[s.length() + 1];
    F[0] = true;
    for (int i = 1; i < s.length(); ++i) {
      for (int k = 0; k < i; ++k) {
        if (k == 0) {
          if (F[k] == true && dict.contains(s.substring(k, i + 1))) {
            // System.out.println(s.substring(k, i + 1) + "," + k + "+" + i);
            F[i] = true;
          }
        } else if (F[k] == true && dict.contains(s.substring(k + 1, i + 1))) {
          F[i] = true;
          // System.out.println(s.substring(k + 1, i + 1) + "," + k + "+" + i);
          break;
        }
      }
    }
    // for (boolean b : F)
    // System.out.print(b + " ");
    return F[s.length() - 1];
}

2nd trial: TLE

模仿Code Ganker的写法, 特别是边界的定义. 因为如果是像第一次那样定义的话, k还要分2种情况. 代码很麻烦, 特别是对于index处理越多越容易乱或者出错
注意这里的i的范围是一般向的0…s.length()-1. 而j的范围是0…i. 注意这个i是包括的. 因为s.substring(k, i+1). 所以k最大应该取到i, 这样substring就是s[i].
这时候的时间复杂度是O(n^2). 不过如同SOF link: substring()是O(n)操作. 所以其实是: O(n^3)的时间复杂度.
就算这样, 还是TLE!

代码如下:

public static boolean wordBreak2nd(String s, Set<String> dict) {
   if (s == null || s.length() == 0) {
     return true;
   }
   boolean[] F = new boolean[s.length() + 1]; // string DP always leave 0th as no select always
                                              // true, so the DP space is len+1.
   F[0] = true;
   for (int i = 0; i < s.length(); ++i) { // notice the init and end of i for states
     for (int k = 0; k <= i; ++k) {
       if (F[k] == true && dict.contains(s.substring(k, i+1))) {
         F[i+1] = true;
       }
     }
   }
   return F[s.length()];
 }

4th trial: AC

看了九章的参考答案. 发现这时候没有的变成O(n). 那只有prune了. 注意到, F[k]==true的时候, 其实并不用判断所有substring(k, i+1). 因为如果这个substring长于dict内单词的长度就可以不用考虑了. 而且通常来说: word的长度最多25左右. 即i-k+1<=maxlen, 即k>=i-1-maxlen.(注意这里的是i, 不是i+1) 所以可以缩小一点k的取值范围.

代码如下:

public static boolean wordBreak2nd(String s, Set<String> dict) {
   if (s == null || s.length() == 0) {
     return true;
   }
   boolean[] F = new boolean[s.length() + 1]; // string DP always leave 0th as no select always
                                              // true, so the DP space is len+1.
   F[0] = true;
   for (int i = 0; i < s.length(); ++i) { // notice the init and end of i for states
     for (int k = 0; k <= i; ++k) {
       if (F[k] == true && dict.contains(s.substring(k, i+1))) {
         // System.out.println(s.substring(k, i) + "," + k + "+" + i);
         F[i+1] = true;
       }
     }
   }
   // for (boolean b : F)
   // System.out.print(b + " ");
   return F[s.length()];
 }

总结

DP问题一定要想清楚state和optimally equation在写代码.
loop的范围, 以及得到state的方式一定要设计好. 例如这里的是F[i+1]还是F[i]就很大差别. 因为这里要表示的是: 取string前i个元素是成立的. 所以substring(k, i+1), 然后F[i+1]. 这样就是i个元素的state成立, 于是接下来就可以F[k] && substring. 因为这样substring的下确界k正好是kth元素. eg: F[4] && substring(4,7).
1
2
3
4
5
l i n t c o d e
0 1 2 3 4 5 6 7

F[4] = true, substring(4, 7+1) => F[7+1] = true. 对于lintcode的例子.
F[7] = true, substring(7, 7+1) => F[7+1] = true. 对于ProgramCreek的例子.

题目

思路

问题

first trial : TLE

如果使用简单的recursion会出现TLE, 代码如下:

public class Solution {
    /**
     * @param triangle: a list of lists of integers.
     * @return: An integer, minimum path sum.
     */

    public int minimumTotal(ArrayList<ArrayList<Integer>> triangle) {
        // write your code here
        int[] min = new int[]{Integer.MAX_VALUE};
        dfs(0, 0, 0, min, triangle);
        return min[0];
    }


    public static void dfs(int x, int y, int sum, int[] min, ArrayList<ArrayList<Integer>> triangle) {
        if (x == triangle.size()) {
            min[0] = Math.min(min[0], sum);
            return;
        }
        dfs(x+1, y, sum+triangle.get(x).get(y), min, triangle);
        dfs(x+1, y+1, sum+triangle.get(x).get(y), min, triangle);
        return;
    }
}

Lintcode里面TLE的页面如下:
有几个要注意的地方:
1. 因为Lintcode是给了class, 要求写method. 虽然可以有helper method, 例如这里的dfs. 但是因为triangle不是field, 只是一个param, 所以必须要传给helper当param. 不然用不了.
2. Lintcode 的 test case是看不到全部的… 所以也不能copy放到eclipse里面算了. 但是有个基本常识就是一般不能有超过20层的recursion. 所以尽量不用recursion才行.

2nd trial: pass

看到这种Divide and Conquer的问题自然想到使用DP解决. 这样可以使用重复计算的部分, 而且避免递归. 注意到: 这是一个找min的值的问题. 自然是DP

学习了经典的DP

还有local, global最优解DP

以及变动边界的loop

题目

题目链接](http://www.lintcode.com/en/problem/jump-game/)
好的test case: {0,8,2,0,9},{5, 8, 3, 0, 6, 7, 9, 6, 3, 4, 5, 2, 0, 6, 2, 6, 7, 10, 8, 0}

思考

first trial: fail

状态: boolean F[i] 表示能否从起点到达position i.
一开始我的optimally equation想错了: 我想的是: F[i] = F[k] && A[k]>= abs(k-i). 这里的k是0…A.length. 在一开始的test case可以过, 但是{5, 8, 3, 0, 6, 7, 9, 6, 3, 4, 5, 2, 0, 6, 2, 6, 7, 10, 8, 0}就错了. 应该是true, 但我返回false.
- 这里有2个错误, 一开始我想的是会不会有先跳到远的, 然后跳回来近的, 再跳到终点. 实际上是不会的, 因为这里的A[i]表示的是从position i, 能跳的最远距离, 而不是能跳的距离(像大富翁). 所以k不应该loop全部node, 而是小于i的所有node.
- 更严重的错误是: F[i]表示的是是否能从起点到达. 所以只要能到达, 就OK了. 也就是说: 只要true, 就停止loop. 这也是导致我的这个code fail的原因.

second trial: OK

这一次, 我还是DP, 不过改好了optimally equation, 即只要发现F[i]==true就break loop.
但是这时候的时间是O(n**2). 因为要loop每一个node, 每一个node的内循环最坏情况是loop完所有小于i的k. 所以n+n-1+n-2+...+1 = O(n^2). 而且空间是O(n).

third trial: Failed

这里还是使用Greedy最好, 因为只要最远距离超过position就行了, 那就看每一步能跳到的最远距离就行了. 实现的时候是使用经典的: local/global optimal. 再次复习一下这种使用2个DP表来解决问题的DP方法. 即local DP记录: 以目前node出发能得到的最优解. Global则是记录历史到目前node位置能得到的最优解.
还要复习一下Greedy, 我之前的理解不对:
- 错误: 之前想的是, Greedy不就是从位置1跳最远距离, 例如跳到位置5, 然后再从位置5跳最远距离. 觉得这种Greedy太Greedy了. 很有可能, 位置5的A[5]=0. 那就没得跳了.
- 实际上, Greedy是针对每一个node都考虑一遍Greedy. 即: 1-5, 2-4-7, 3-9-dest.
实现上面, Ganker只用了O(1)的空间. 我则是使用通用的DP表, 记录每一个node的state.

代码

public static boolean jumpLocalGlobalDP(int[] A) {
    if (A == null || A.length == 0)
      return false;
    int[] local = new int[A.length];
    int[] global = new int[A.length];

    // init base condition
    local[0] = A[0];
    global[0] = A[0];

    // build the local/global DP table
    for (int i = 1; i < A.length; ++i) {
      local[i] = A[i] + i;
      global[i] = Math.max(global[i-1], local[i]);
    }

    // ans
    if (global[A.length-1] >= A.length - 1) {
      return true;
    }
    return false;
}

Forth trial: AC

刚才的做法是错误的, 例如{0,8,2,0,9}.
仔细研究一下Ganker的写法, 他并不是简单的loop, 而是会改变boundary的loop. 这个很有意思. 在使用一个queue实现的BFS模版里面, 有个很重要的点就是queue.size()不能作为boundary, 要int size = queue.size(). 因为queue会在loop里面改变大小. 而这里之所以要使用变动的boundary是因为, 如果当前node只考虑能打到的情况. 例如这个例子里面, node 1是达不到的.

所以很重要的一点是loop的范围应该是:

1	for (int i = 1; i <= global[i-1] && i < A.length; ++i)

2015-05-23

Unit7-9chap-Graph

2015-05-23

Unit6-9chap-List

今天继续讲Dynamic Programming.

后2个常见的DP类型题: Two Sequence DP和Backpack.

Circular Queue的作用

在SRAM里面相当于是无限大的buffer. 实现基本都是下图所示:
. 其中enqueue在Head, dequeue在tail. 且Head指向下一个插入的位置, Tail指向下一个可以读取的位置. 这样每次进来, 只要判断是否wrap, full. 然后就可以读写了.
在一般来说是作为asynch thread communication. 见CS105: Harvey Mudd
实现方式: github-Synchronization-CircularQueue.

Circular Queue的区别

wiki已经写了circular queue的几种实现方式, 区别在于判断empty/full的方式:
1. head/tail+1 empty slot
  - 这个实现在: Ring Buffer_Anders Kaloer
  - Scott的Firmware里面的也是这样做的.
2. head/tail+count
  - 一般都是用这个写, 因为简单, 而且semaphore其实就使这个count. 表示的是: 当前queue里面保存的item数量.
  - 我参考的Harshkn的写法. 改了一点. 主要是count在满了之后就不动了. 然后在push/pop里面加入了判断error的代码, 来自于kqchen.

实现代码

1. 使用Count

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/**
 * https://gist.github.com/harshkn/909546
 * C Implementation of simple circular buffer, Functionality provided are add to queue, read latest
 *
 * Implementation: head, tail, and count. The major different between AndersKaloer/Harshkn is the way to detect full/empty.
 * The former trade with an empty slot, the latter use additional int->count. I review this because this is easier and Scott uses it too.
 */

typedef struct circular_buffer {
	void *buffer;     // data buffer
	void *buffer_end; // end of data buffer
	size_t capacity;  // maximum number of items in the buffer
	size_t count;     // number of items in the buffer
	size_t sz;        // size of each item in the buffer
	void *head;       // pointer to head
	void *tail;       // pointer to tail
} circular_buffer;

void cb_init(circular_buffer *cb, size_t capacity, size_t sz) {
	cb->buffer = malloc(capacity * sz);
	if (cb->buffer == NULL) {
		// handle error
	}
	cb->buffer_end = (char *) cb->buffer + capacity * sz;
	cb->capacity = capacity;
	cb->count = 0;
	cb->sz = sz;
	cb->head = cb->buffer;
	cb->tail = cb->buffer;
}

void cb_free(circular_buffer *cb) {
	free(cb->buffer);
	// clear out other fields too, just to be safe
}

void cb_push_back(circular_buffer *cb, const void *item) {
	if (cb->count == cb->capacity) {
		printf("%s, %d: cb push back error. \n", __FILE__, __LINE__);
//		return;
	}
	memcpy(cb->head, item, cb->sz);
	printf("push data = %d\n", *(char *) cb->head);
	cb->head = (char*) cb->head + cb->sz;
	if (cb->head == cb->buffer_end)
		cb->head = cb->buffer;
	cb->count = cb->count == cb->capacity ? cb->count : cb->count + 1;
}

void cb_pop_front(circular_buffer *cb, void *item) {
	if (cb->count == 0) {
		printf("%s, %d: cb_pop_front: empty cb.\n", __FILE__, __LINE__);
		return;
	}
	memcpy(item, cb->tail, cb->sz);
	printf("pop data = %d\n", *(char *) cb->tail);
	cb->tail = (char*) cb->tail + cb->sz;
	if (cb->tail == cb->buffer_end)
		cb->tail = cb->buffer;
	cb->count--;
}

int cb_empty(circular_buffer *cb) {
	if (cb->count == 0)
		return 1;
	return 0;
}

int cb_full(circular_buffer *cb) {
	return cb->count == cb->capacity;
}

void printbuf(circular_buffer *cb) {
	int i = 0;
	for (; i < cb->capacity; ++i)
		printf("%d ", *(char *) (cb->buffer + i));
}

/**
 * https://github.com/kqchen/workspace/blob/21ab1c18dbf6c674437b0c89bb313ddbfc2de85c/hello/circular.c
 */
void cb_print(circular_buffer *cb) {
	int cnt = cb->count;
	do {
		if (cb->count == 0) {
			printf("%s, %d: cb is empty now.\n", __FILE__, __LINE__);
			return;
		}
		printf("cb->tail=0x%x, cb->head=0x%x, cnt = %d\n", cb->tail, cb->head,
				cnt);
		cnt--;
	} while (cnt > 0);
}
/**
 * testing the ring buffer
 */
int main() {
	circular_buffer tony; // a holder
	cb_init(&tony, 4, 1);
	int i = 10;
	for (; i < 20; ++i) {
		cb_push_back(&tony, &i);
//		printf("%d with count: %d, data = %d\n", cb_full(&tony), tony.count, i);
	}

	printf("\nafter filling ---\n");
	cb_print(&tony);
//	printf("after filling ---\n");
//	printbuf(&tony);
	char result;
	int j = 0;
	for (; j < 6; ++j) {
		cb_pop_front(&tony, &result);
//		printf("%d with count: %d, result = %d\n", cb_empty(&tony), tony.count, result);
	}
	printf("\nafter pop ---\n");
	cb_print(&tony);
	return 0;
}

Report: Interleaving String

标签（空格分隔）： Lintcode DP Ladder

题目

题目链接

思考

first trial: fail

F[i][j]表示是否可以用S1, S2的前i,j个字符表示S3的前i+j个字符. 既然是M x N的DP, 自然要先init 第一行和第一列.
然后就是optimally equation. 就是每次看S3的i+j-1个字符是否等于S1的i-1或者S2的j-1. 如果是, 就沿用F[i-1][j]或F[i][j-1].

但是这样fail在第15/18个test case:

["sdfjas;dfjoisdufzjkndfasdkfja;sdfa;dfa;dfaskdjhfasdhjdfakhdgfkajdfasdjfgajksdfgaksdhfasdkbfjkdsfbajksdfhakjsdfbajkdfbakdjsfgaksdhgfjkdsghfkdsfgadsjfgkajsdgfkjasdfh", "dfnakdjnfjkzghdufguweygfasjkdfgb2gf8asf7tgbgasjkdfgasodf7asdgfajksdfguayfgaogfsdkagfsdhfajksdvfbgkadsghfakdsfgasduyfgajsdkfgajkdghfaksdgfuyadgfasjkdvfjsdkvfakfgauyksgfajkefgjkdasgfdjksfgadjkghfajksdfgaskdjfgasjkdgfuyaegfasdjkfgajkdfygadjskfgjkadfg", "sdfjas;dfjoisdfnakdjnfjkzghdufguwdufzjkeygfasjkdfgb2gf8asf7ndtgbgasjkdfgasodf7asdfgfajkasdksdfguayfgaogfsdkagfsfjadhfajksdvfbgkadsghfa;sdkdsfgasduyfgajsdkfgafajkdghfaksdgfuyadgfas;dfjkdvfjsdkvfakfgauyksa;dgfajkefgjkdasgfdjksffaskdjhfasdhjdfakhdgadjkghfajgfkajdfksdfgaskdjfgasjkdgfuasdjfgajksdfgaksdhfasdkbfjkdsfbajksdfyaegfasdjkfgajkdfygadjskfgjkadfghakjsdfbajkdfbakdjsfgaksdhgfjkdsghfkdsfgadsjfgkajsdgfkjasdfh"]

应该return false, 但我return true.
找了好久, 终于发现问题了: 我的递推方程是:

if (s3.charAt(i + j - 1) == s1.charAt(i - 1)) {
  F[i][j] = F[i - 1][j];
} else if (s3.charAt(i + j - 1) == s2.charAt(j - 1)) {
  F[i][j] = F[i][j - 1];
}
else {
  F[i][j] = false;
}

注意这里: 如果s1.charAt(i - 1)和s3.charAt(i + j - 1)和s2.charAt(j - 1)相等的话, F[i][j]是给else if覆盖了. 所以有时候对有时候错. 所以是我的optimally equation写错了. 这里是or的关系.

2nd trial: AC

这次解决了这个bug. 使用OR来得到F[i][j].

3rd trial: fail

虽然刚才那样做是可以AC, 但是空间复杂度太大! 因为这个只和上一层跟左一列有关系. 所以可以用1d的rolling array来做. 那为什么这里和distinct subsequences的1d不同, 是自左向右循环呢? 分析如下:
update的时候要看是否使用更新前还是更新后的上一层的信息.
- 例如distinct subsequences就是使用更新前的上一层信息, 所以内循环是自后往前.
- 而像Interleaving string的1d rolling array就是自前向后.
- 这2个有什么区别呢: 区别在于optimally equation, 看到底需要上一层的哪个数, 如果是上一层的左边, 那么就不能覆盖的循环, 所以要自后向前. 而且若是同时需要上一层左侧和这一层左侧的话就不能用1d滚动数组了, 用mod rolling.
- 总结见下图:
Ganer就是这样写的. 但是我一开始先是像distinct substring那样写: 即内循环是从后向前来update State. 结果是错的.
- 所以这里有个重要事情. 什么时候是自前向后, 什么时候自后向前. 这里并不是说Top-down, Bottom-up. 而是说1d rolling array里面是使用更新前还是更新后的值.
  
  每次迭代我们只保存了上一行的信息。接下来从前往后还是从后往前就取决于我们要用的是更新前还是更新后的信息，如果从前往后，我们会使用更新后的信息

4th trial : AC

这题和distinct substring 的 rolling array的内层循环不同的是: 这里是从前往后.

Wikirize my knowledge

2015-05-30 Largest rectangle in instagram 解题报告

题目

思考

方法1: O(n^2)的遍历做法.

方法2: Stack的使用: O(1)找左右第一个比他小的数

方法3: Dynamic programming: 凡是找极值都应该想想动态规划.

2015-05-26 Algs-Japan-toc

Elementary

1. exhaustive search

2. Greedy

3. DP I

4. Data Structure

5. Graph

6. Math trick

*. Problems

Intermediate

1. Binary Search

2. Tricks I

3. Data Structure

4. DP II

5. Max flow/Min cut

6. jihe

*. Problems

High

1. Math Problems

2. Game Strategies

3. Graph Theory

4. Tricks II

5. Clever search

6. Divide and Conquer

7. Elegant String Algs

*. problems

2015-05-25 Report: Word Break

题目

思考

first trial: fail

2nd trial: TLE

4th trial: AC

总结

2015-05-24 Report: Triangle Min Path Sum

题目

思路

问题

first trial : TLE

2nd trial: pass

2015-05-24 Report: Jump Game I

题目

思考

first trial: fail

second trial: OK

third trial: Failed

Forth trial: AC

2015-05-23 Unit7-9chap-Graph

2015-05-23 Unit6-9chap-List

2015-05-22 Unit 5 NineChap : Dynamic programming - part I

2015-05-22 CircularQueue

Circular Queue的作用

Circular Queue的区别

实现代码

1. 使用Count

2015-05-20 Interleaving String 解题报告

Report: Interleaving String

题目

思考

first trial: fail

2nd trial: AC

3rd trial: fail

4th trial : AC

Categories

Tag Cloud

Recent Posts

Links

2015-05-30

Largest rectangle in instagram 解题报告

2015-05-26

Algs-Japan-toc

2015-05-25

Report: Word Break

2015-05-24

Report: Triangle Min Path Sum

2015-05-24

Report: Jump Game I

2015-05-23

Unit7-9chap-Graph

2015-05-23

Unit6-9chap-List

2015-05-22

Unit 5 NineChap : Dynamic programming - part I

2015-05-22

CircularQueue

2015-05-20

Interleaving String 解题报告