This post is not meant to explain what Docker is or how it works, since there are countless tutorials and other material online that already explain it very well. So, to run the Dockerfile presented below on your machine, you need to have Docker installed beforehand.
Dockerfile
FROM ubuntu:14.04
MAINTAINER likang
# install Python and Scrapy
RUN apt-get update
RUN apt-get install -y python python-pip python-dev libxml2-dev libxslt-dev libffi-dev libssl-dev
RUN pip install lxml && pip install pyopenssl && pip install Scrapy && pip install service_identity
# install git
RUN apt-get install -y git
# create a directory for the Scrapy project
RUN mkdir /scrapyguj
# clone the project
RUN cd /scrapyguj; git clone https://github.com/LeoCBS/guj.git
# run Scrapy
WORKDIR /scrapyguj/guj
CMD ["scrapy", "crawl", "java", "-o items.json"]
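One detail of the CMD line is worth a closer look: in exec form, each element of the JSON array becomes exactly one argument, so "-o items.json" reaches the program as a single string. The sketch below illustrates the effect with Python's standard optparse module. The parser here is only a hypothetical stand-in for a crawl-style command line, not Scrapy's own code, but it is consistent with the FEED_URI value ' items.json' (with a leading space) that appears in the log further down.

```python
from optparse import OptionParser


def parse_output_option(argv):
    """Parse a crawl-style argument list and return the value given to -o."""
    parser = OptionParser()
    parser.add_option("-o", "--output", dest="output")
    options, _positional = parser.parse_args(argv)
    return options.output


# As written in the Dockerfile: "-o items.json" is ONE argv element,
# so everything after "-o" (including the space) becomes the value.
as_written = parse_output_option(["crawl", "java", "-o items.json"])

# Alternative: flag and value passed as two separate elements.
split = parse_output_option(["crawl", "java", "-o", "items.json"])

print(repr(as_written))  # ' items.json'
print(repr(split))       # 'items.json'
```

If the leading space in the file name is not wanted, writing the flag and the file name as two separate array elements in CMD should avoid it.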
Building the image from the Dockerfile:
The -t parameter sets the name of the image that will be built (in this case, 'scrapy').
sudo docker build -t scrapy .
Output of the build command:
Sending build context to Docker daemon 98.3 kB
Sending build context to Docker daemon
Step 0 : FROM ubuntu:14.04
---> 2d24f826cb16
Step 1 : MAINTAINER likang
---> Using cache
---> 20b513cdff4a
Step 2 : RUN apt-get update
---> Using cache
---> 0b7c32020c8d
Step 3 : RUN apt-get install -y python python-pip python-dev libxml2-dev libxslt-dev libffi-dev libssl-dev
---> Using cache
---> 6405a0b87f8c
Step 4 : RUN pip install lxml && pip install pyopenssl && pip install Scrapy && pip install service_identity
---> Using cache
---> 6bc06c32dfae
Step 5 : RUN apt-get install -y git
---> Using cache
---> f86ccaba0d25
Step 6 : RUN mkdir /scrapyguj
---> Using cache
---> 9d2c1b35e294
Step 7 : RUN cd /scrapyguj; git clone https://github.com/LeoCBS/guj.git
---> Using cache
---> 0f6f7a80dbcb
Step 8 : WORKDIR /scrapyguj/guj
---> Using cache
---> 097cfa0924e0
Step 9 : CMD scrapy crawl java -o items.json
---> Using cache
---> 91b67f7a3b30
Step 10 : VOLUME /scrapyguj/guj
---> Running in 2f0384c56e97
---> 6dfd289a80e2
Removing intermediate container 2f0384c56e97
Successfully built 6dfd289a80e2
Running our container:
The -d parameter runs the container in the background (detached mode); scrapy is the name of the image.
sudo docker run -d scrapy
Checking Scrapy's output log
First, let's run docker ps to get the ID of our container:
$ sudo docker ps
CONTAINER ID   IMAGE           COMMAND                CREATED         STATUS         PORTS   NAMES
cac0e1987b5d   scrapy:latest   "scrapy crawl java '   7 seconds ago   Up 6 seconds           jolly_hopper
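If you would rather not copy the ID by hand, the listing can be picked apart with a few lines of Python. This is only a minimal sketch: the function name container_ids is hypothetical, and it assumes the listing has already been captured as text in the column layout shown above (ID first, image second).

```python
# Sample listing, copied from the docker ps output shown in this post.
PS_OUTPUT = """CONTAINER ID   IMAGE           COMMAND                CREATED         STATUS         PORTS   NAMES
cac0e1987b5d   scrapy:latest   "scrapy crawl java '   7 seconds ago   Up 6 seconds           jolly_hopper"""


def container_ids(ps_output, image):
    """Return the IDs of all listed containers created from `image`."""
    ids = []
    for row in ps_output.splitlines()[1:]:  # skip the header row
        fields = row.split()
        # The first two columns never contain spaces: ID, then image.
        if len(fields) >= 2 and fields[1] == image:
            ids.append(fields[0])
    return ids


print(container_ids(PS_OUTPUT, "scrapy:latest"))  # ['cac0e1987b5d']
```

Only the first two columns are read, because later columns such as COMMAND and CREATED contain spaces and cannot be split reliably this way.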
$ sudo docker logs -f cac0e1987b5d
2015-03-11 19:44:24+0000 [scrapy] INFO: Scrapy 0.24.5 started (bot: guj)
2015-03-11 19:44:24+0000 [scrapy] INFO: Optional features available: ssl, http11
2015-03-11 19:44:24+0000 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'guj.spiders', 'REDIRECT_MAX_TIMES': 5, 'FEED_URI': ' items.json', 'SPIDER_MODULES': ['guj.spiders'], 'BOT_NAME': 'guj', 'TELNETCONSOLE_ENABLED': False, 'FEED_FORMAT': 'json', 'DOWNLOAD_DELAY': 600}
2015-03-11 19:44:24+0000 [scrapy] INFO: Enabled extensions: FeedExporter, LogStats, CloseSpider, WebService, CoreStats, SpiderState
2015-03-11 19:44:24+0000 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-03-11 19:44:24+0000 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-03-11 19:44:24+0000 [scrapy] INFO: Enabled item pipelines: GujPipeline, JsonWithEncodingPipeline
2015-03-11 19:44:24+0000 [java] INFO: Spider opened
2015-03-11 19:44:24+0000 [java] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-03-11 19:44:24+0000 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
2015-03-11 19:44:25+0000 [java] DEBUG: Crawled (200) <GET http://www.guj.com.br/?p=0> (referer: None)
2015-03-11 19:44:25+0000 [java] DEBUG: url: http://www.guj.com.br/?p=0
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Atualizar combo sem fechar jframe '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Indicar Caminho SDK no Eclipse. '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'ORMLite n\xe3o insere valores ? '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'D\xfavida: \xc9 necess\xe1rio baixar as bibliotecas do hibernate se estiver usando o Jboss WildFly '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'como gerar numeros pares aleat\xf3rio? '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Como chamar um relat\xf3rio Jaspersoft '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'metodo update Java '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Falha ao utilizar campo de pesquisa pelo Spring MVC e JpaRepository '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Reflex\xe3o: Objetos mut\xe1veis e imut\xe1veis '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Salvar dados em duas tabelas diferentes - Swing '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Pegar IP da m\xe1quina e salvar no banco de dados '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Pegar Ip da rede e setar a uma vari\xe1vel. '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Como habilitar a op\xe7\xe3o de minimizar no NetBeans? '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Editar uma jTextPane1. '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'problema para subir aplica\xe7\xe3o '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Java - SimpleDateFormat com Enum '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Descobrir erro em aplica\xe7\xe3o no celular '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Select com JavaScript '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'CommunicationsException '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Imprimir pelo client-side jsf '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Melhor forma de comunica\xe7\xe3o entre Aplica\xe7\xe3o e Servidor '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Carrinho como adicionar quantidade em um List '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'XStream - com Banco de Dados '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'como posso cria um id do edittext para edita a porta pela tela do app '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Resultados diferentes entre execu\xe7\xe3o e modo debug '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'como colocar meu menu com cor diferente um do outro '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'conectando jdbc ao banco de dados mysql '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Pagina\xe7\xe3o PHP,... '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'D\xfavida que com certeza algu\xe9m aqui sabe e j\xe1 passou por ela '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Como pegar os valores da div e mandar para uma var em php? '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'problema com p:dialog do primefaces '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Transa\xe7\xe3o em EJBs em diferentes jars '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Realizar soma negativa SQL '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Erro no arquivo de configura\xe7\xe3o do hibernate '}
2015-03-11 19:44:25+0000 [java] DEBUG: Scraped from <200 http://www.guj.com.br/?p=0>
{'title': u'Como melhorar um algoritmo que encontra o maior e o menor inteiro de um vetor unidimensional? '}
2015-03-11 19:45:24+0000 [java] INFO: Crawled 1 pages (at 1 pages/min), scraped 35 items (at 35 items/min)
2015-03-11 19:46:24+0000 [java] INFO: Crawled 1 pages (at 0 pages/min), scraped 35 items (at 0 items/min)
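The scraped items in the log look garbled (n\xe3o, relat\xf3rio) only because they are printed as Python 2 representations of unicode strings; the data itself is intact. As a minimal sketch, assuming you have copied a couple of item lines out of the log as text, the following turns them back into readable titles. The helper name parse_item is hypothetical.

```python
import ast

# Two item lines copied verbatim from the log above. They are raw strings,
# so the backslash escapes stay exactly as they were printed.
LOG_LINES = [
    r"{'title': u'ORMLite n\xe3o insere valores ? '}",
    r"{'title': u'Como chamar um relat\xf3rio Jaspersoft '}",
]


def parse_item(line):
    """Turn one logged item back into a dict with a cleaned-up title."""
    item = ast.literal_eval(line)  # evaluates literals only, no arbitrary code
    return {"title": item["title"].strip()}


titles = [parse_item(line)["title"] for line in LOG_LINES]
for title in titles:
    print(title)
# ORMLite não insere valores ?
# Como chamar um relatório Jaspersoft
```

ast.literal_eval is used instead of eval because it accepts only literal values, which is all these log lines contain.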
See you next time!