Deploying an Azure Data Factory project as ARM Template

In my last post I wrote about how to Debug Custom .Net Activities in Azure Data Factory locally. This fixes one of the biggest issues in Azure Data Factory at the moment for developers. The next bigger problem that you will run into is when it comes to deploying your Azure Data Factory project. At the moment, you can only do it manually from Visual Studio which, for bigger projects, can take quite some time. So I extended and advanced the code from my CustomActivityDebugger. Well, actually I rewrote some major parts of it and moved it into a new GitHub repository: Azure.DataFactory.LocalEnvironment

The new code base now includes the functionality to export an existing ADF project to an ARM template which can then be deployed very easily using Azure standard deployment mechanisms.

So basically, these are the changes and new Features that I made:

Export as ARM template:

  • Export all ADF objects and properties
  • Support for configurations
  • obey dependencies between ADF objects
  • parameterized Data Factory name
  • automatic upload of ADF dependencies (e.g. custom activities)
  • specify the region where ADF should be deployed (ADF is not available in all regions yet!)

ADF_LocalEnvironment_DeployARMTemplate

Custom Activity Debugger:

  • simplified usability – just select the pipeline, activity and set the slice-dates
  • Support for configurations
  • no need to add any namespaces
  • no need to add any references
  • write activity log to console output

ADF_LocalEnvironment_DebugActivity_Breakpoing

General:

  • Load from the ADF Project file (.dfproj) instead of a whole folder
  • implemented as Assembly
  • can be used in a Console Application for automation
  • will be published via NuGet in the future! (coming soon)

 

Everything else is described in the Git-Repository itself!

Hope you enjoy it!

Monitoring Azure Data Factory using PowerBI

Some time ago Microsoft released the first preview of a tool which allows you to monitor and control your Azure Data Factory (ADF). It is a web UI where you can select your Pipeline, Activity or Dataset and check its execution state over time. However, from my very personal point of view the UI could be much better, especially much clearer(!) as it is at the moment. But that’s not really a problem as the thing I like the most about ADF is that its quite open for developers (for example Custom C#/.Net Activities) and it also offers a quite comprehensive REST API to control an manage it.
For our monitoring purposes we are mainly interested in the LIST interface but we could do basically every operation using this API. In my example I only used the Dataset API, the Slices API and the Pipeline API.

First we start with the Dataset API to get a list of all data sets in our Data Factory. This is quite simple as we just need to build our URL of the REST web service like this:

  1. https://management.azure.com/subscriptions/{SubscriptionID}/resourcegroups/{ResourceGroupName}/providers/Microsoft.DataFactory/datafactories/{DataFactoryName}/datasets?api-version={api-version}

You can get all of this information for the Azure Portal by simply navigating to your Data Factory and checking the URL which will be similar to this one:

  1. https://portal.azure.com/#resource/subscriptions/1234567832324a04a0a66e44bf2f5d11/resourceGroups/myResourceGroup/providers/Microsoft.DataFactory/dataFactories/myDataFactory

So this would be my values for the API Call:
– {SubscriptionID} would be “12345678-3232-4a04-a0a6-6e44bf2f5d11”
– {ResourceGroupName} would be “myResourceGroup”
– {DataFactoryName} would be “myDataFactory”
– {api-version} would be a fixed value of “2015-10-01”

Once you have your URL you can use PowerBI to query the API using Get Data –> From Web
Next you need to authenticate using your Personal or Organizational Account – the same that you use to sign in to the Portal – and also the level for which you want to use the credentials. I’d recommend you to set it either to the subscription level or to the data-factory itself, depending on your security requirements. This ensures that you are not asked for credentials for each different API:
ADF_PowerBI_Authentication

This works in a very similar way also for the Slices API, the Pipeline API and all other APIs available! The other transformations I used are regular PowerQuery/M steps done via the UI so I am not going to describe them in more detail here. Also, setting up the relationships in our final PowerPivot model should be straight forward.

Now that we have all the required data in place, we can start with our report. I used some custom visuals for the calendar view, some slicers and a simple table to show the details. I also used a Sankey Chart to visualize the dependencies between the datasets.

ADF_PowerBI_Monitoring_Dashboard
ADF_PowerBI_Monitoring_Dependencies

Compared to the standard GUI for monitoring this provides a much better overview of slices and their current states and it also allows easy filtering. I am sure there are a lot of other PowerBI visualizations which would make a lot of sense here, these are just to give you an idea how it could look like, but of course you have all the freedom PowerBI offers you for reporting!

The only drawback at the moment is that you cannot reschedule/reset slices from PowerBI but for my monitoring-use-case this was not a problem at all. Also, I did not include the SliceRun API in my report as this would increase the size of the data model a lot, so detailed log information is not available in my sample report.

The whole PowerBI template is available for download on my GitHub site: https://github.com/gbrueckl/Azure.DataFactory.PowerBIMonitor

Debugging Custom .Net Activities in Azure Data Factory

Azure Data Factory (ADF) is one of the newer tools of the whole Microsoft Data Platform on Azure. It is Microsoft’s Data Integration tool, which allows you to easily load data from you on-premises servers to the cloud (and also the other way round). It comes with some handy templates to copy data fro various sources to any available destination. However, when the Extract-Transform-Load (ETL) or ELT steps get more complicated you will hit the (current) out-of-the-box limits of Azure Data Factory pretty soon. But this is OK as ADF is a very open platform and allows you to integrate so called “Custom Activities”. These can either be .Net/C# Activities or HDInsight Activities. In this post we will focus on .Net Activities and how to develop and debug them in an efficient way.

A .Net Activity is basically just a .dll which implements a specific Interface (IDotNetActivity)and is then executed by the Azure Data Factory. To be more precise here, the .dll (and all dependencies) are copied to an Azure Batch Node which then executes the code when the .Net Activity is scheduled by ADF. So far so good, but the tricky part is to actually develop the .Net code, test, and debug it. Well, not the code itself but the more or less complex integration with the ADF Interface which you are very likely not familiar with in the beginning. In such cases it usually helps to run the code locally, step into the different code paths and examine the C# objects and their values. The problem is that you do not have a local instance of ADF on your workstation which you could use the start the .Net Activity and debug it interactively in Visual Studio.
So I wrote my own tool which you can add to the Solution that already contains the code of your Custom .Net Activity. Then you can simply link the CustomActivityDebugger to the JSON definitions and configurations of your ADF project, reference your custom code, configure some other things like SliceStart/SliceEnd and you are ready to go.
Once you start the CustomActivityDebugger it will read all ADF files and settings and basically create a local ADF environment which helps you to debug your custom .Net Activity using all settings and parameters as they would be passed in when the code is executed on the Azure Batch Node.

This little picture shows the CustomActivityDebugger in action – debugging custom .Net activities is now like debugging any other code:
Debugger_in_Action

All the sources including a simple ADF Project, a simple Custom Activity and setup instructions are available on my GitHub site:

https://github.com/gbrueckl/Azure.DataFactory.CustomActivityDebugger

Feel free to use it as it is and/or extend it to your needs.

Upcoming Conferences – Fall and Winter 2016

The next months are going to be quite busy for me. There will be a lot conferences and events which I will attend and where I am also speaking. I am always happy to attend those events as it is a great place to interact with other community members and learn new stuff about the latest and coolest new tools. So lets get started:

The first conference is the SQL Saturday #546 in Oporto (Portugal) on the 1st of October where I will be speaking about “What’s new in Analysis Services 2016”.
PASS_SQL_Saturday_546_Oporto
It is the first time for me to speak in Porto but from what I heard from other people it is said to be a really great event so I am really looking forward to it!

The week after – on the 6th of October – I will be speaking at the “Virtuelle SQL Server 2016 Konferenz” (German only) where I will be doing a session on SQL Server Polybase. As the name implies, it is virtual and you can attend from wherever you are at the time. Another important thing is that it is for free! You just need to register and you’ll get two full days of sessions around SQL Server and the Microsoft Data Platform.

In the same week, the SQL Saturday #555 in Munich (Germany) is taking place.
PASS_SQL_Saturday_555_Munich
Another free event for all community members. Again, just make sure you register in time to get a ticket! I will do a more advanced session on Polybase there with a more technical focus. So even if you attended the virtual event before, this might still be of great value for you!

Later in October, the world’s biggest conference about the Microsoft Data Platform and all other related technologies – the PASS Summit – will be in Seattle again, attracting 4,000+ attendees every year.
PASS_Summit_2016
It’s probably a bit late but you can still register for this great event. Unfortunately, I forgot to submit some sessions for this conference so I won’t be there this year. (Note to myself: submit sessions for 2017!) Having been there already several times, I can only recommend this conference to anyone who is into Microsoft, it’s Data Platform and everything around it.

For January 2017, I am very happy to announce that our SQL Saturday #579 in Vienna (Austria) is taking place the 4th year in a row now.
PASS_SQL_Saturday_579_Vienna
We had 250+ attendees last year and we are definitely aiming to hit the 300 mark this year! Again, it is free and you just need to register as soon as possible. (In case you cannot make it later on, please unregister as early as possible as otherwise you take away the slot of someone else). The Call for Papers is still open so if you want to speak at the event, simply submit your session details there.

Last but not least, the SQL Server Konferenz 2017 will be back in Darmstadt as in the previous years.
PASS_SQL_Konferenz_2017
This means 3-4 full days of sessions and trainings around the Microsoft Data Platform. Overall an awesome event organized by PASS Germany. It is also the 5th time in a row this year and believe me, they have organized and optimized everything down to the tiniest detail making it one of the best and biggest conferences in whole Europe!
I already submitted some sessions and I am hoping to be selected to speak there.

As you can see, a lot of things are ahead but I am really looking forward to all of them!

I just received my first Microsoft MVP Award!

Almost exactly 10 years ago I started my career in the field of Data Warehousing and Business Intelligence using the Microsoft Data Platform. Every since I have been active in the community, sharing my experiences and engaged with other people in various forums.
I have to admit that most of the things that I know today I originally learned from my strong interaction with the community and by solving (or trying to solve) problems of other community members that I came across when browsing e.g. the MSDN forums. This exchange is always a win-win situation for both sides as you grow with every challenge you master and that’s what the community is all about – having someone to discuss your issues with and solve them together!

Therefore I am very proud that I got awarded with the Microsoft MVP Award for my work in the community which includes blogging, answering forum questions, speaking at conferences and supporting local Microsoft events – or, to say it in a more general way, my engagement in the community!

Microsoft_MVP_Logo_Horizontal_Preferred_Cyan300_CMYK_300ppi

Thanks to everyone who I ever worked with, letting me learn on their problems, inspiring me to start my blog and keep being active in the community over the last years! Thank You!

Score whole PowerBI DataSets dynamically in Azure ML

One of the most requested features when it comes to Azure ML is and has always been the integration into PowerBI. By now we are still lacking a native connector in PowerBI which would allow us to query a published Azure ML web service directly and score our datasets. Reason enough for me to dig into this issue and create some Power Query M scripts to do this. But lets first start off with the basics of Azure ML Web Services.

Every Azure ML project can be published as a Web Service with just a single click. Once its published, it can be used like any other Web Service. Usually we would send a record or a whole dataset to the Web Service, the Azure ML models does some scoring (or any other operation within Azure ML) and then sends the scored result back to the client. This is straight forward and Microsoft even supplies samples for the most common programming languages. The Web Service relies on a standardized REST API which can basically be called by any client. Yes, in our case this client will be PowerBI using Power Query.
Rui Quintino has already written an article on AzureML Web Service Scoring with Excel and Power Query and also Chris Webb wrote a more generic one on POST Request in Power Query in general Web Service and POST requests in Power Query. Even Microsoft recently published an article how you can use the R Integration of Power Query to call a Azure ML Web Service here.

Having tried these solutions, I have to admit that they have some major issues:
1) very static / hard coded
2) complex to write
3) operate on row-by-row basis and might run into the API Call Limits as discussed here.
4) need a local R installation

As Azure ML usually deal with tables, which are basically Power Query DataSets, a requirement would be to directly use a Power Query DataSet. The DataSet has to be converted dynamically into the required JSON structure to be POSTed to Azure ML. The returned result, usually a table again, should be converted back to a Power Query DataSet. And that’s what I did, I wrote a function that does all this for you. All information that you have to supply can be found in the configuration of your Azure ML Web Service:
– Request URI of your Web Service
– API Key
– the [Table to Score]

the [Table to Score] can be any Power Query table but of course has to have the very same structure (including column names and data types) as expected by the Web Service Input. Then you can simply call my function:
InvokeFunction_PowerBI_DataSet_in_AzureML
Score_PowerBI_DataSet_in_AzureML

The whole process involves a lot of JSON conversions and is kind of complex but as I encapsulated everything into M functions it should be quite easy to use by simply calling the CallAzureMLService-function.

However, here is a little description of the used functions:
ToAzureMLJson – converts any object that is passed in as an argument to a JSON element. If you pass in a table, it is converted to a JSON-array. Dates and Numbers are formatted correctly, etc. so the result can the be passed directly to Azure ML.

AzureMLJsonToTable – converts the returned JSON back to a Power Query Table. It obeys column names and also data types as defined in the Azure ML Web Service output. If the output changes (e.g. new columns are added) this will be taken care of dynamically!

CallAzureMLService – uses the two function from above to convert a table to JSON, POST the JSON to Azure ML and convert the result back to a Power Query Table.

Known Issues:
As the [Table to Score] will probably come from a SQL DB or somewhere else, you may run into issues with Privacy Levels/Settings and the Formula Firewall. In this case make sure to enable Fast Combine for your workbook as described here.

The maximum timeout of a Request/Response call to an Azure ML Web Service is 100 seconds. If your call exceeds this limit, you might get an error message returned.I ran a test and tried to score 60k rows (with 2 numeric columns) at once and it worked just fine, but I would assume that you can run into some Azure ML limits here very easily with bigger data sets. As far as I know, these 100 seconds are for the Azure ML itself only. If it takes several minutes to upload your dataset in the POST request, than this is not part of this 100 seconds. If you are still hitting this issue, you could further try to split your table into different batches, score them separately and combine the results again afterwards.

 

So these are the steps that you need to do in order to use your Azure ML Web Service together with PowerBI:
1) Create an Azure ML Experiment (or use an existing)
2) Publish the Experiment as a Web Service
3) note the URL and the API Key of your Web Service
4) run PowerBI and load the data that you want to score
5) make sure that the dataset created in 4) has the exact same structure as expected by Azure ML (column names, data types, …)
6) call the function “CallAzureMLWebService” with the parameters from 3) and 5)
7) wait for the Web Service to return the result set
8) load the final table into PowerBI (or do some further transformations before)

And that’s it!

Download:
You can find a PowerBI workbook which contains all the functions and code here: CallAzureMLWebService.pbix
I used a simple Web Service which takes 2 numeric columns (“Number1” and “Number2”) and returns the [Number1] * [Number2] and [Number1] / [Number2]

PS: you will not be able to run the sample as it is as I changed the API Key and also the URL of my original Azure ML Web Service

Running Local R Scripts in Power BI

One of the coolest features of Power BI is that I integrates very well with other tools and also offers a lot of interfaces which can be used to extend this capabilities even further. One of those is the R Integration which allows you to run R code from within Power BI. R scripts can either be used as a data source or for visualizing your data. In this post I will focus on the data source component and show how you can use a locally stored R script and execute it directly in Power BI. Compared to the native approach where you need to embed the R code in the Power BI file, this has several advantages:

  • Develop R script in familiar external tool like RStudio
  • Integration with Source Control
  • Leverage Power BI for publishing and visualizing results

Out of the box Power BI only supplies one function to call R scripts as a data source which is R.Execute(text). Usually, when you use the wizard, it simply passes your R script as a hardcoded value to this function. Knowing the power of Power BI and its scripting language M for data integration made me think – “Hey, as R scripts are just text files and Power BI can read text files, I could also dynamically read any R script and execute it!”

Well, turns out to be true! So I created a little M function where I pass in the file-path of an existing R script and which returns a table of data frames which are created during the execution of the script. Those can then be used like any other data sets/tables within Power BI:
Power_BI_R_DataSource_dynamic_local_script

And here is the corresponding M code for the Power Query function:
(Thanks also to Imke Feldmann for simplifying my original code to the readable one below)

  1. let
  2.     LoadLocalRScript = (file_path as text) as table =>
  3. let
  4.     Source = Csv.Document(File.Contents(file_path),[Delimiter=#(lf), Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]),
  5.     RScript =  Text.Combine(Source[Column1], "#(lf)"),
  6.     output = R.Execute(RScript)
  7. in
  8.     output
  9. in
  10.     LoadLocalRScript

First we read the R script like any other regular CSV file but we use line-feed (“#(lf)”) as delimiter. So we get a table with one column and one row for each line of our original R script.
Then we use Text.Combine() on our column to transform the single lines back into one long text resembling our original R script. This text can the be passed to the R.Execute() function to return the list of R data frames created during the execution of the script.

And that’s it! Any further steps are similar to using any regular R script which is embedded in Power BI so it is up to you on how you proceed from here. Just one thing you need to keep in mind is that changing the local R script might break the Power BI load if you changed or deleted any data frames which are referenced in Power BI later on.

One issues that I came across during my tests is that this approach does not work with scheduled refreshes in the Power BI Web Service via the Personal Gateway. The first reason for this is that it is currently not possible to use scheduled refresh if custom functions are involved. Even if you can work around this issue pretty easily by using the code from above directly in Power Query I still ran into issues with different privacy levels for the location of the R script and the R.Execute() function. But I will investigate into those issues and update this blog post accordingly (see UPDATE below).
For the future I hope that is fixed by Microsoft and Power BI allows you to execute remote scripts natively – but until then, this approach worked quite well for me.

UPDATE:
To make the refresh via the Personal Gateway work you have to enable “FastCombine”. How to do this is described in more detail here: Turn on FastCombine for Personal Gateway.

In case you are interested in more details on this approach, I am speaking at TugaIT in Lisbon, Portugal this Friday (20th of May 2016) about “Power BI for the Data Scientist” where I will cover this and lots of other interesting topics about the daily work of a data scientist and how PowerBI can used to ease them.

Downloads:
Power BI Workbook: Load_Local_R_Script_wFunction.pbix
Sample R Script: Sample_R_Script.r

Visualizing SSAS Calculation Dependencies using PowerBI

 

UPDATE: This does not work for Tabular Models in Compatibility Level 120 or above as they do not expose the calculation dependencies anymore!

 

One of my best practices when designing bigger SQL Server Analysis Services (SSAS) Tabular models is to nest calculations whenever possible. The reasons for this should be quite obvious:

  • no duplication of logics
  • easier to develop and maintain
  • (caching)

However, this also comes with a slight drawback: after having created multiple layers of nested calculations it can be quite hart to tell on which measures a top-level calculations actually depends on. Fortunately the SSAS engine exposes this calculation dependencies in one of its DMVs – DISCOVER_CALC_DEPENDENCY.
This DMV basically contains information about all calculations in the model:

  • Calculated Measures
  • Calculated Columns
  • Relationships
  • Dependencies to Tables/Columns

Chris Webb already blogged about this DMV some time ago and showed some basic (tabular) visualization within an Excel Pivot table (here). My post focuses on PowerBI and how can make the content of this DMV much more appealing and visualize it in a way that is very easy to understand.
As the DMV is built up like a parent-child hierarchy, I had to use a recursive M-function to resolve this self-referencing table which actually was the hardest part to do. Each row contains a link to a dependent object, which can have other dependencies again. In order to visualize this properly and let the user select a Calculation of his choice to see a calculation tree, I needed to expand each row with all of its dependencies, keeping their link to the root-node:

Here is a little example:

Object Referenced_Object
A B
B C

The table above is resolved to this table:

Root Object Referenced_Object
A A B
A B C
B B C

The Root-column is then used to filter and get all dependent calculations.
The PowerBI file also contains some other M-functions but those are mainly for ease-of-use and to keep the queries simple.

Once all the data was loaded into the model, I could use one of PowerBI’s custom visuals from the PowerBI Gallery – the Sankey Chart with Labels
SSAS_Visualizing_Tabular_Calc_Dependencies

Here is also an interactive version using the Publishing Feature of Power BI:

 

You can use the Slicers to filter on the Table, the Calculation Type and the Calculation itself and the visual shows all the dependencies down to the physical objects being Tables and Columns. This makes it a lot easier to understand your model and the dependencies that you built up over time.
I attached the sample-PowerBI-file below. You simply need to change the connectionstring to your SSAS Tabular Server and refresh the data connections.

The PowerBI-file (*.pbix) can be downloaded here: SSAS_CalcDependencies.pbix

C# Wrapper for Power BI REST API

Since the last major update last year, Power BI offers some APIs which can be used to interact with content and also data that is stored in Power BI. Microsoft provides a good set of samples on how to use the APIs on GitHub and also a an interactive APIARY web-UI which you can use to build and test API calls on-the-fly. However, it can still be quite cumbersome as you have to deal with all the REST API calls and the returned JSON on your own. So I decided to write a little C# Wrapper where you simply pass in your Azure AD Application Client ID and you can deal with all Object of the Power BI API as they were regular C# objects.

Here is a little example on how to list all available reports and get the EmbedURL of a given tile using the PowerBIClient:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using pmOne.PowerBI;
using pmOne.PowerBI.PowerBIObjects;

namespace SampleApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            PowerBIClient pbic = new PowerBIClient(“ef4aed1a-9cab-4bb3-94ea-ffffffffffff”);

            Console.WriteLine(“Available Reports:”);
            foreach(PBIReport pbir in pbic.Reports)
            {
                Console.WriteLine(pbir.Name);
            }

            Console.WriteLine();
            Console.WriteLine(“Get EmbedURL for Tile [Retail Analysis Sample].[This Year’s Sales]”);
            Console.WriteLine(pbic.GetDashboardByName(“Retail Analysis Sample”).GetTileByName(“This Year’s Sales”).EmbedURL);

            Console.WriteLine(“Press <Enter> to exit …”);
            Console.ReadLine();
        }
    }
}

As you can see, its pretty simple and very easy to use, even for non-developers. You can find all the source-code and the sample application for download below. The code as I have written it is very likely not the best code possible, but it works for my needs, is straight forward, simple and saves me a lot of work and time when dealing with the PowerBI API. Also, if the API changes, you may need to adopt the code accordingly. However, for the future I hope that Microsoft provides some metadata so that VisualStudio can build all this code automatically using e.g. Swagger. But for the time being feel free to use, improve or extend my code Smile

SourceCode: PowerBIClient_Source.zip

Open Analysis Services Tabular Database Online

Those of you who have been working with SSAS Multidimensional in the past probably know that you can connect online to their SSAS database via Visual Studio / Data Tools.
Open_SSAS_DB_Visual_Studio

Any change you make (and save) online, will be directly deployed to the server and is the visible to the end-user immediately. This can be very convenient if you want to quickly check something or do some hot-fixes (e.g. changing the MDX script).  But be aware, structural changes might require you to process the changed and dependent objects so be sure about what you are changing online, especially if you are connecting to a productive environment!

I am quite sure that everyone who works with SSAS Tabular has also tried this feature for his Tabular database and ended up with the following error message:
”You are trying to connect to <servername> server running in tabular mode using the Tabular Model Designer. The option to open an Analysis Services Database is supported for servers running multidimensional mode only.”
Open_SSAS_DB_Visual_Studio_Tabular

So this simply does not work out of the box. However, there is a neat workaround which allows you to connect to your online SSAS Tabular database and do any changes you want. The idea behind this is to use the online database as our workspace database.

The first thing to do is to open Visual Studio / Data Tools and import the existing database into a new Project:
Visual_Studio_Import_SSAS_DB_Tabular

Then you need to select your workspace server. If there is no pop-up asking your for a workspace server, then you have already configured a default one which will be used in this case. As we are going to change this in the next step again, it does not really matter which workspace server you choose.

Now you are asked for the database which you want to import – choose the one that you want to connect to online. Once the import process is finished, Visual Studio already creates a workspace database for you and names it as follows: “<VS ProjectName>_<NT-Username>_<random GUID>” – in my case it was “TabularProject1_gbrueckl_b44f11de-21f4-4d18-bf67-0c25652fceba”. Any change you make in Visual Studio will be deployed directly to this database. Closing Visual Studio will unload the workspace database from memory by default.

Having all this information you probably can imagine where all this is leading to. The workspace database name and server can be configured and are persisted in the user-specific “Model.bim_<user>.settings”-file located in your project folder:

Model.bim_gbrueckl.settings
<?xml version=1.0 encoding=utf-8?>
<ModelUserSettings xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance xmlns:xsd=http://www.w3.org/2001/XMLSchema>
  <ServerName>localhost\TAB2014</ServerName>
  <DatabaseName>AdventureWorksDW2012_Online_gbrueckl_b44f11de-21f4-4d18-bf67-0c25652fceba</DatabaseName>
  <DbRetention>OnDisk</DbRetention>
  <SnapshotBackup>DoNotKeepBackup</SnapshotBackup>
  <Annotations />
  <IsRecalcRequired>false</IsRecalcRequired>
  <IsImpersonationModified>false</IsImpersonationModified>
  <CheckForImpersonationWarning>false</CheckForImpersonationWarning>
  <RequirePastedTablesUpgrade>false</RequirePastedTablesUpgrade>
  <TruncatedTables />
  <IsPowerPivotMetadataScriptExecuted>false</IsPowerPivotMetadataScriptExecuted>
  <IsASImport>false</IsASImport>
  <IsPowerPivotImport>false</IsPowerPivotImport>
  <SelectedCompatibilityLevel>300</SelectedCompatibilityLevel>
</ModelUserSettings>

The settings you need to change here should be obvious – <ServerName> and <DatabaseName>. Change them to match your online SSAS Tabular server and database – the one you previously imported the project from.
But be sure to also change the <DbRetention>! Leaving the default “OnDisk” here would unload your database once you close Visual Studio what is definitely not what you want! You need to set this setting to “InMemory” to keep the database in memory – remember, we want to connect online but keep the database accessible once we are done.

Before you do all this changes you should close your Visual Studio solution completely to ensure there is nothing cached internally. Then simply open the .settings-file, do the changes described above and re-open your solution. If you have done everything correctly you should already see data for all of your tables:
Visual_Studio_SSAS_Tabular_with_Data
This is already the live-data that resides in your online database!

Congratulations! You are now connected online to your SSAS Tabular Database!

This approach can also be very useful if you are working with a backup of a SSAS Tabular database as it allows you to make online changes to the restored database and see all the data that exists in the database. Importing only the database project without this little hack would leave you with an empty database project which is very hard to work with if you need to create new calculations. Further, this also does not require you to reprocess the whole database which might not even be possible if you have no connection the the data sources underneath!

But before you get too much excited about this, there are some more things to keep in mind:

  • This is not officially supported by Microsoft
  • This was just experimental but proved (for me) to be very handy in some scenarios
  • Opening a SSAS Tabular Solution in Visual Studio sends and ALTER statement to the workspace database (in this case this is your productive database!) and updates the server with the metadata defined in your local .bim-file. If your server database changes frequently, this is probably not what you want as you would overwrite changes done by someone else recently. To work around this issue you would need to re-create/import your SSAS project every time before making any online changes to make sure you are always re-deploying the current state of the database when opening your local SSAS project.
  • I am not responsible for any data loss, damage or whatsoever!

If you want to use this approach to deploy hot-fixes and this happens frequently, you may also consider using a more professional approach for this – for example the BISM Normalizer Visual Studio Add-In which allows you to select the changes that you want to deploy to a target server, similar to schema compare for SQL Server.