This script parses the SPAC list published on spactrack and saves it to an Excel file.
# Parse the SPAC stock list and store it in an Excel file
# python 3.8
import requests
from bs4 import BeautifulSoup
import xlsxwriter
# spactrack's SPAC stock list
url1 = "https://sheet2site-staging.herokuapp.com/api/v3/index.php/?search=&key=1F7gLiGZP_F4tZgQXgEhsHMqlgqdSds3vO0-4hoL6ROQ&e=1"
url2 = "https://sheet2site-staging.herokuapp.com/api/v3/load_more.php/?key=1F7gLiGZP_F4tZgQXgEhsHMqlgqdSds3vO0-4hoL6ROQ&template=Table%20Template&filter=&search=&e=1&is_filter_multi=true&length=99&page={}"
row = 0
columnlist = ["SPAC Ticker", "Name", "Status", "SPAC Target Focus", "Target Company(if Deal Announced)", "Prominent Leadership / Directors / Advisors", "Trust Value(from last filing)", "Market Cap", "Commons Price", "Commons % Change Previous Day", "Unit Price", "Warrant Price", "Unit & Warrant Details", "Estimated Unit Split Date", "Warrant Intrinsic Value", "IPO Date", "IPO Size(M)", "Underwriter(s)", "Estimated Completion Deadline Date", "% Progress to Deadline", "SEC Filings", "Tags"]
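
def CreateXlsx():
    # Return the workbook together with the worksheet so the caller
    # can still close the workbook and flush the file to disk.
    w_obj = xlsxwriter.Workbook('spack.xlsx')
    worksheet = w_obj.add_worksheet()
    return w_obj, worksheet

def CloseXlsx(w_obj):
    w_obj.close()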
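
def TestEndofIndex():
    # Probe the load_more pages until the server answers "end";
    # the page before that one is the last page that holds data.
    for index in range(1, 10):
        res = requests.get(url2.format(index))
        if res.text == "end":
            return index - 1
    # No "end" marker within the first 9 pages
    return False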

def Parsing(index, worksheet):
    global row
    # Page 1: the table rendered by index.php
    html = requests.get(url1).text
    # Name the parser explicitly to avoid bs4's "no parser specified" warning
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.select('tbody tr'):
        cells = [td.getText() for td in tag.findAll("td")]
        row = row + 1
        for col, cell in enumerate(cells):
            worksheet.write(row, col, cell)
    #cols.append(all_cols[i].findAll("td")[j].find(text=True).strip().encode('cp949'))
    # Pages 2..N: rows served by load_more.php
    for i in range(1, index + 1):
        html = requests.get(url2.format(i)).text
        soup = BeautifulSoup(html, "html.parser")
        for tag in soup.select('tr'):
            cells = [td.getText() for td in tag.findAll("td")]
            row = row + 1
            for col, cell in enumerate(cells):
                worksheet.write(row, col, cell)

w_obj = xlsxwriter.Workbook('spack.xlsx')
worksheet = w_obj.add_worksheet()
bold = w_obj.add_format({'bold': 1})

# Write the header row to the Excel file
for i, header in enumerate(columnlist):
    worksheet.write(0, i, header, bold)

r_TestEndofIndex = TestEndofIndex()
# Compare by identity: a valid result of 0 would equal False under `!=`
if r_TestEndofIndex is not False:
    Parsing(r_TestEndofIndex, worksheet)
    w_obj.close()
else:
    input("[ERROR] TestEndofIndex")
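
The end-of-pages probe in TestEndofIndex can be tried without touching the network. The following is only a minimal sketch, not the script's own code: `find_last_page`, `fetch_page`, `max_pages`, and the fake page data are hypothetical names introduced for illustration, with `fetch_page` standing in for `requests.get(url2.format(i)).text`. It assumes the server signals the end with the literal text "end", as the script above does, and it returns `None` instead of `False` when no marker is found, so a valid result of 0 cannot be confused with failure.

```python
from typing import Callable, Optional


def find_last_page(fetch_page: Callable[[int], str],
                   max_pages: int = 9) -> Optional[int]:
    """Return the last page number that holds data, or None if the
    "end" marker is not seen within max_pages requests."""
    for index in range(1, max_pages + 1):
        # strip() is an assumption: it tolerates stray whitespace
        # around the marker, which the original script does not.
        if fetch_page(index).strip() == "end":
            return index - 1
    return None


# Hypothetical stand-in for the real endpoint:
# pages 1-3 hold rows, page 4 answers "end".
fake_pages = {
    1: "<tr><td>A</td></tr>",
    2: "<tr><td>B</td></tr>",
    3: "<tr><td>C</td></tr>",
    4: "end",
}


def fake_fetch(page: int) -> str:
    return fake_pages.get(page, "end")


last_page = find_last_page(fake_fetch)
print(last_page)  # 3
```

Passing the fetch function as a parameter keeps the probing logic separate from `requests`, so the same function could be pointed at the real URL by passing `lambda i: requests.get(url2.format(i)).text`.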