Tableau Scraper
https://github.com/bertrandmartel/tableau-scraping
# Tableau Scraper

Tableau Scraper is a Python library designed to extract data from Tableau visualizations programmatically. It provides a comprehensive set of tools for interacting with Tableau dashboards: retrieving worksheet data as pandas DataFrames, applying filters and parameters, selecting items, navigating story points, and downloading data in various formats. The library works by interacting with Tableau's internal APIs, making it possible to scrape data from public Tableau visualizations without manual intervention.

The library supports both client-side and server-side rendered Tableau dashboards and offers methods for handling complex dashboard interactions. Key features include worksheet data extraction, filter and parameter manipulation, item selection, story point navigation, drill-up/drill-down operations, CSV and crosstab downloads, and tooltip rendering. It uses requests for HTTP communication, BeautifulSoup for HTML parsing, and pandas for data handling, providing a straightforward way to integrate Tableau data into Python workflows.

## TableauScraper Class

The main entry point for scraping Tableau visualizations. Initialize a scraper instance with a configurable logging level and request delay, then load a Tableau URL to begin data extraction.
```python
from tableauscraper import TableauScraper as TS
import logging

# Initialize the scraper with custom settings
ts = TS(logLevel=logging.DEBUG, delayMs=500, verify=True)

# Load a Tableau visualization URL
url = "https://public.tableau.com/views/PlayerStats-Top5Leagues20192020/OnePlayerSummary"
ts.loads(url)

# Get the workbook containing all worksheets
workbook = ts.getWorkbook()

# Access all worksheets and their data
for worksheet in workbook.worksheets:
    print(f"Worksheet: {worksheet.name}")
    print(worksheet.data)  # pandas DataFrame
    print("---")
```

## Get Specific Worksheet

Retrieve a single worksheet by name directly from the TableauScraper instance, returning a TableauWorksheet object whose data is exposed as a pandas DataFrame.

```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/PlayerStats-Top5Leagues20192020/OnePlayerSummary"

ts = TS()
ts.loads(url)

# Get a specific worksheet by name
ws = ts.getWorksheet("ATT MID CREATIVE COMP")

# Access the data as a pandas DataFrame
print(ws.data)

# Get column names from the worksheet
columns = ws.getColumns()
print(f"Available columns: {columns}")
```

## Select Items from Worksheet

Select interactive items within a worksheet to trigger data updates. First retrieve the selectable items, then select a specific value to get the updated dashboard data.
```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/PlayerStats-Top5Leagues20192020/OnePlayerSummary"

ts = TS()
ts.loads(url)
ws = ts.getWorksheet("ATT MID CREATIVE COMP")

# Get all selectable items with their columns and possible values
selectable_items = ws.getSelectableItems()
for item in selectable_items:
    print(f"Column: {item['column']}")
    print(f"Values: {item['values'][:5]}...")  # show the first 5 values

# Get values for a specific column
values = ws.getSelectableValues("ATTR(Player)")
print(f"Available players: {values}")

# Select a value and get the updated dashboard
updated_dashboard = ws.select("ATTR(Player)", "Vinicius Júnior")

# Access worksheets from the updated dashboard
for worksheet in updated_dashboard.worksheets:
    print(f"Worksheet: {worksheet.name}")
    print(worksheet.data)
```

## Set Parameters

Get the available parameters from a workbook and set parameter values to dynamically change the visualization data.

```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/PlayerStats-Top5Leagues20192020/OnePlayerSummary"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

# Get all available parameters
parameters = workbook.getParameters()
for param in parameters:
    print(f"Column: {param['column']}")
    print(f"Parameter Name: {param['parameterName']}")
    print(f"Values: {param['values']}")
    print("---")

# Set a parameter value
updated_workbook = workbook.setParameter("P.League 2", "Ligue 1")

# Access the updated worksheet data
for worksheet in updated_workbook.worksheets:
    print(f"Worksheet: {worksheet.name}")
    print(worksheet.data)

# Alternative: pass inputParameter directly for API requests
updated_workbook = workbook.setParameter(
    inputName=None,
    value="Ligue 1",
    inputParameter="[Parameters].[P.League (copy)_1642969456470679625]"
)
```

## Set Filters

Apply categorical filters to worksheets to narrow down the data. Supports single values, multiple values, and several filter modes.
```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/WomenInOlympics/Dashboard1"

ts = TS()
ts.loads(url)

# Get a specific worksheet
ws = ts.getWorksheet("Bar Chart")
print("Original data:")
print(ws.data)

# Get all available filters for the worksheet
filters = ws.getFilters()
for f in filters:
    print(f"Column: {f['column']}")
    print(f"Values: {f['values']}")
    print(f"Current Selection: {f['selection']}")
    print("---")

# Apply a single filter value
updated_workbook = ws.setFilter('Olympics', 'Winter')

# Get the filtered worksheet data
filtered_ws = updated_workbook.getWorksheet("Bar Chart")
print("Filtered data:")
print(filtered_ws.data)

# Apply multiple filter values
updated_workbook = ws.setFilter('Olympics', ['Winter', 'Summer'])

# Use the dashboard filter API instead of categorical-filter-by-index
updated_workbook = ws.setFilter('Olympics', 'Winter', dashboardFilter=True)

# Use filter-delta mode (unselect all except the specified value)
updated_workbook = ws.setFilter('Olympics', 'Winter', filterDelta=True)

# Specify filter indices directly
updated_workbook = ws.setFilter('Olympics', [], indexValues=[0, 1, 2])

# Skip the membership target in the request
updated_workbook = ws.setFilter('Olympics', 'Winter', membershipTarget=False)
```

## Navigate Story Points

List and navigate through story points in dashboards that contain storyboard presentations.
```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/EarthquakeTrendStory2/Finished-Earthquakestory"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

# Get all available story points
story_points = workbook.getStoryPoints()
print(f"Storyboard: {story_points['storyBoard']}")
for story_group in story_points['storyPoints']:
    for story in story_group:
        print(f"ID: {story['storyPointId']}, Caption: {story['storyPointCaption']}")

# Navigate to a specific story point
story_point_workbook = workbook.goToStoryPoint(storyPointId=10)

# Get the worksheet names in the story point
worksheet_names = story_point_workbook.getWorksheetNames()
print(f"Worksheets in story point: {worksheet_names}")

# Access worksheet data from the story point
timeline_data = story_point_workbook.getWorksheet("Timeline").data
print(timeline_data)
```

## Level Drill Up/Down

Navigate through hierarchical data levels in visualizations that support drill-up and drill-down operations.

```python
from tableauscraper import TableauScraper as TS

url = "https://tableau.azdhs.gov/views/ELRv2testlevelandpeopletested/PeopleTested"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

sheet_name = "P1 - Tests by Day W/ % Positivity (Both) (2)"

# Drill down through levels (zoom in)
level1 = workbook.getWorksheet(sheet_name).levelDrill(drillDown=True, position=0)
print("Level 1 (after first drill down):")
print(level1.getWorksheet(sheet_name).data)

# Continue drilling down
level2 = level1.getWorksheet(sheet_name).levelDrill(drillDown=True, position=0)
print("Level 2 (after second drill down):")
print(level2.getWorksheet(sheet_name).data)

level3 = level2.getWorksheet(sheet_name).levelDrill(drillDown=True, position=0)
print("Level 3 (after third drill down):")
print(level3.getWorksheet(sheet_name).data)

# Drill up (zoom out) by setting drillDown=False
level2_again = level3.getWorksheet(sheet_name).levelDrill(drillDown=False, position=0)
print("Back to Level 2:")
print(level2_again.getWorksheet(sheet_name).data)
```

## Download CSV Data

Download full data as CSV for dashboards that have the download feature enabled. Returns the data as a pandas DataFrame.

```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/WYCOVID-19Dashboard/WyomingCOVID-19CaseDashboard"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

# Download CSV data for a specific sheet
csv_data = workbook.getCsvData(sheetName='case map')
print(csv_data)

# Some Tableau servers use different API prefixes:
# the default is "vudcsv", but some use "vud"
csv_data = workbook.getCsvData(sheetName='worksheet1', prefix="vud")

# Export the DataFrame and save to a file
if csv_data is not None:
    csv_data.to_csv('tableau_data.csv', index=False)
    print("Data saved to tableau_data.csv")
```

## Download Cross Tab Data

Download crosstab data for dashboards that have the crosstab export feature enabled.

```python
from tableauscraper import TableauScraper as TS

url = "https://tableau.soa.org/t/soa-public/views/USPostLevelTermMortalityExperienceInteractiveTool/DataTable2"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

# Optionally set parameters before downloading
workbook = workbook.setParameter(inputName="Count or Amount", value="Amount")

# Download crosstab data for a specific sheet
crosstab_data = workbook.getCrossTabData(
    sheetName="Data Table 2 - Premium Jump & PLT Duration"
)
print(crosstab_data)

# Save to CSV if data was retrieved
if crosstab_data is not None:
    crosstab_data.to_csv('crosstab_export.csv', index=False)
```

## Navigate to Different Sheets

List all available sheets (visible and hidden) and navigate between different dashboard views.
```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/COVID-19VaccineTrackerDashboard_16153822244270/Dosesadministered"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

# Get a list of all sheets with metadata
sheets = workbook.getSheets()
for sheet in sheets:
    print(f"Sheet: {sheet['sheet']}")
    print(f"  Is Dashboard: {sheet['isDashboard']}")
    print(f"  Is Visible: {sheet['isVisible']}")
    print(f"  Subsheets: {sheet['namesOfSubsheets']}")
    print(f"  Window ID: {sheet['windowId']}")
    print("---")

# Navigate to a different sheet by name
new_sheet_workbook = workbook.goToSheet("NYC Adults")

# Access worksheets from the new sheet
for worksheet in new_sheet_workbook.worksheets:
    print(f"Worksheet name: {worksheet.name}")
    print(worksheet.data)
```

## Render Tooltip

Get tooltip HTML content by simulating a mouse hover at specific coordinates. Useful for server-side rendered dashboards where data is not directly accessible.

```python
from tableauscraper import TableauScraper as TS
from bs4 import BeautifulSoup

url = "https://public.tableau.com/views/CMI-2_0/CMI"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()

# Get a specific worksheet
ws = workbook.getWorksheet("US Map - State - CMI")

# Render the tooltip at specific x, y coordinates;
# coordinates correspond to pixel positions on the visualization
tooltip_html = ws.renderTooltip(x=387, y=196)
print(tooltip_html)

# Parse the tooltip HTML to extract its text content
if tooltip_html:
    soup = BeautifulSoup(tooltip_html, 'html.parser')
    tooltip_text = soup.get_text(separator='\n')
    print(tooltip_text)
```

## Get Downloadable Summary and Underlying Data

Retrieve summary data or full underlying data from worksheets using Tableau's data export APIs.
```python
from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/PlayerStats-Top5Leagues20192020/OnePlayerSummary"

ts = TS()
ts.loads(url)
workbook = ts.getWorkbook()
ws = workbook.getWorksheet("ATT MID CREATIVE COMP")

# Get summary data (aggregated data shown in the visualization)
summary_data = ws.getDownloadableSummaryData(numRows=200)
print("Summary Data:")
print(summary_data)

# Get underlying data (raw data behind the visualization)
underlying_data = ws.getDownloadableUnderlyingData(numRows=200)
print("Underlying Data:")
print(underlying_data)
```

## Command Line Usage

Use the included scripts for quick exploration of Tableau dashboards without writing code.

```bash
# Clone the repository
git clone git@github.com:bertrandmartel/tableau-scraping.git
cd tableau-scraping/scripts

# Get worksheet data from a Tableau URL
python3 prompt.py -get workbook -url "https://public.tableau.com/views/PlayerStats-Top5Leagues20192020/OnePlayerSummary"

# Interactive selection of items
python3 prompt.py -get select -url "https://public.tableau.com/views/MKTScoredeisolamentosocial/VisoGeral"

# Interactive parameter setting
python3 prompt.py -get parameter -url "https://public.tableau.com/views/COVID-19DailyDashboard_15960160643010/Casesbyneighbourhood"
```

## Handling Server-Side Rendered Dashboards

For dashboards using server-side rendering, data cannot be extracted directly. Use filter iteration or tooltip rendering to extract data from these visualizations.
```python
import pandas as pd

from tableauscraper import TableauScraper as TS

# Example for a server-side rendered dashboard
url = "https://example-ssr-tableau.com/views/Dashboard"

ts = TS()
ts.loads(url)
ws = ts.getWorksheet("Main Chart")

# List the available filters
filters = ws.getFilters()

# Iterate through all filter values to extract data
all_data = []
for filter_value in filters[0]['values']:
    # Apply the filter
    updated_wb = ws.setFilter(filters[0]['column'], filter_value)
    updated_ws = updated_wb.getWorksheet("Main Chart")

    # If data is available, collect it
    if not updated_ws.data.empty:
        updated_ws.data['filter_value'] = filter_value
        all_data.append(updated_ws.data)

# Combine all extracted data
if all_data:
    combined_data = pd.concat(all_data, ignore_index=True)
    print(combined_data)
```

## Summary

Tableau Scraper provides a comprehensive solution for programmatic data extraction from Tableau visualizations. Primary use cases include automated data collection from public dashboards for analysis, monitoring dashboard changes over time, integrating Tableau data into Python pipelines, and extracting data from interactive visualizations without manual export. The library handles authentication sessions, manages API request timing, and returns pandas DataFrames for seamless integration with data analysis workflows.

When integrating Tableau Scraper into an application, start by initializing the TableauScraper class and loading the target URL, then use getWorkbook() or getWorksheet() to access data. For interactive dashboards, chain operations such as setFilter(), setParameter(), and select() to navigate to specific data views before extraction. Handle server-side rendered dashboards by iterating through filter values or rendering tooltips. Configure the delayMs parameter to respect rate limits and avoid API throttling.
The library's architecture allows for building robust data pipelines that can adapt to changes in Tableau dashboard configurations while maintaining reliable data extraction capabilities.
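The filter-iteration workflow described above (apply each filter value, collect the non-empty DataFrames, pause between requests to respect rate limits) can be condensed into a reusable helper. This is a sketch, not part of the Tableau Scraper API: `collect_filtered` is a hypothetical name, and the explicit `delay_s` pause stands in for the library's own `delayMs` setting.

```python
import time

import pandas as pd


def collect_filtered(ws, column, values, sheet_name, delay_s=0.5):
    """Hypothetical helper: apply each filter value in turn, pausing
    between requests, and concatenate the resulting DataFrames."""
    frames = []
    for value in values:
        wb = ws.setFilter(column, value)       # one request per filter value
        df = wb.getWorksheet(sheet_name).data  # updated worksheet DataFrame
        if not df.empty:
            # Tag each row with the filter value that produced it
            frames.append(df.assign(filter_value=value))
        time.sleep(delay_s)                    # crude client-side pacing
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

Keeping the pacing and concatenation in one place makes it easier to adjust the delay centrally when a server starts throttling, instead of scattering `sleep` calls through each extraction script.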