Long Luo's Life Notes

每一天都是奇迹

By Long Luo

This article is the solution 3 Approaches: Sorting, Two Pointers and Reverse Two Pointers of Problem 88. Merge Sorted Array .

Here shows 3 Approaches to slove this problem: Sorting, Two Pointers and Reverse Two Pointers.

Sorting

Let the array \(\textit{nums}_2\) into the rear of array \(\textit{nums}_1\), and then sort the entire array.

1
2
3
4
5
6
7
public static void merge_sort(int[] nums1, int m, int[] nums2, int n) {
for (int i = 0; i < n; i++) {
nums1[m + i] = nums2[i];
}

Arrays.sort(nums1);
}

Analysis

  • Time Complexity: \(O((m + n)log(m+n))\) .
  • Space Complexity: \(O(log(m+n))\) .

Two Pointers

Since two arrays \(\textit{nums}_1\) and \(\textit{nums}_2\) are both sorted in non-decreasing order, so we can use the Two Pointers method.

We consider the two arrays as two queues, takes the smaller numbers from the two arrays headers each round and put it into our results.

阅读全文 »

By Long Luo

This article is the solution Java HashSet Solution of Problem 2215. Find the Difference of Two Arrays.

Intuition

We can use \(\texttt{HashSet}\) to record the distinct integers of the two arrays and judge whether a distinct integer in \(\textit{nums}_1\) which not present in \(\textit{nums}_2\).

We can only use \(2\) \(\texttt{HashSet}\), not \(4\) \(\texttt{HashSet}\).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
public List<List<Integer>> findDifference(int[] nums1, int[] nums2) {
Set<Integer> set1 = new HashSet<>();
Set<Integer> set2 = new HashSet<>();
for (int num : nums1) {
set1.add(num);
}
for (int num : nums2) {
set2.add(num);
set1.remove(num);
}

for (int num : nums1) {
set2.remove(num);
}

List<List<Integer>> ans = new ArrayList<>();
ans.add(new ArrayList<>(set1));
ans.add(new ArrayList<>(set2));
return ans;
}
阅读全文 »

By Long Luo

This article is the solution 3 Approaches: Brute Force, Recursion and Two Pointers of Problem 680. Valid Palindrome II.

Here shows 3 Approaches to slove this problem: Brue Force, Recursion and Two Pointers.

Brue Force

Let’s start from the simplest method:

  • if \(len(str) \le 2\), definitely return \(\textit{true}\);
  • if the string is a palindrome, return \(\textit{true}\);
  • if not, enumerate each position as the deleted position, and then check the remaining strings is it a palindrome.

Time complexity of this approach is \(O(n^2)\), time limit will be exceeded.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
public static boolean validPalindrome_bf(String s) {
int len = s.length();
if (len <= 2 || validPalindrome(s, 0, len - 1)) {
return true;
}

for (int i = 0; i < len; i++) {
String str = s.substring(0, i) + s.substring(i + 1, len);
if (validPalindrome(str, 0, str.length() - 1)) {
return true;
}
}

return false;
}

public static boolean validPalindrome(String s, int left, int right) {
int len = s.length();
if (left >= len || right < 0 || left > right) {
return false;
}

while (left < right) {
if (s.charAt(left) != s.charAt(right)) {
return false;
}
left++;
right--;
}

return true;
}

Analysis

  • Time Complexity: \(O(n^2)\)
  • Space Complexity: \(O(1)\)
阅读全文 »

By Long Luo

微博热搜爬虫

前几天外甥女要我帮她完成一个小作业,用 Python 完成一个爬虫,于是在网上找了点资料1 和 Github 上找了个爬虫2 的例子,改简单了点,只爬取 微博热搜 数据并存储到 Excel 中,使用 Python Jupyter 编写。

具体代码见:https://github.com/longluo/spider/blob/master/weibohot.ipynb 。

微博爬虫

微博爬虫 3 的代码及功能太多太复杂,但我目前只需要爬取下列信息:

  1. 个人信息
  2. 具体时间段全部博文;
  3. 将数据存入 Excel 和数据库中。

所以需要精简其代码。

微博API

微博获取个人信息API:https://weibo.cn/1565668374/info

微博获取个人全部博文API:https://weibo.cn/1565668374?page=1

具体代码见:https://github.com/longluo/spider/blob/master/weibo.py 。

博客爬虫

在写完上述两个爬虫之后,趁热打铁,也把之前自己一直想做完的功能做完了!

功能

  1. 获取某网站的全部博文比如 https://www.longluo.me/ 。
  2. 爬取内容:文章标题、发布时间、分类、链接、正文(HTML格式)等。

分析

  1. 向目标网页发出请求

Header 是请求头信息的,添加的键值对越多,目标网站就越认为你是真实的用户,但我的网站没有反爬虫措施,所以 Header 里面就只有 User-Agent 键值对,用来描述你的操作系统、浏览器等信息。

  1. 获取网站博客的文章标题、发布时间和链接;
  2. 数据写入Excel中;
  3. 数据存入数据库中。

实现

获取全部博文标题、发布时间和链接

全部博文列表位于 https://www.longluo.me/archives/ 中,首先我们要获取的是总共有多少页呢?

总页数

查看网页源代码,我们可以看见全部页数为 <span class="space">&hellip;</span><a class="page-number" href="/archives/page/48/">48</a>,所以我们只需要解析出来即可。

参考网络代码4 ,利用BeautifulSoup解析HTML,然后利用正则表达式匹配,代码如下所示:

1
2
3
4
5
6
7
8
9
10
def get_total_page(self):
html = self.get_html(self.archivesUrl)

page_number = re.compile('[0-9]+')

soap = BeautifulSoup(html, 'html.parser')

page_item = soap.find_all('a', class_="page-number")
total_pages = re.findall(page_number, str(page_item))[-1]
return int(total_pages)
阅读全文 »

By Long Luo

This article is the solution Simple Two Pointers Solution of Problem 125. Valid Palindrome.

Very easy, we can use Two Pointers to solve this problem.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
public boolean isPalindrome(String s) {
if (s == null || s.length() <= 1) {
return true;
}

int n = s.length();
int left = 0;
int right = n - 1;
while (left < right) {
while (left < n && left < right && !(Character.isDigit(s.charAt(left)) || Character.isAlphabetic(s.charAt(left)))) {
left++;
}

while (right > 0 && left < right && !(Character.isDigit(s.charAt(right)) || Character.isAlphabetic(s.charAt(right)))) {
right--;
}

if (Character.toUpperCase(s.charAt(left)) != Character.toUpperCase(s.charAt(right))) {
return false;
}

while (left < right && Character.toUpperCase(s.charAt(left)) == Character.toUpperCase(s.charAt(right))) {
left++;
right--;
}
}

return true;
}
阅读全文 »
0%