Using R script in Skyline Batch
schen19  2021-08-27 13:26
 

Dear Skyline Team,

I was just trying to set up a configuration for Skyline Batch. I added the path to the R script in "R script file path" and then clicked "OK". However, I did not see the path appear afterwards. I have attached some screenshots to illustrate the problem.

Am I missing some steps? Is there anything I can do to fix this problem?

Thank you in advance.
Shimin

 
 
Ali Marsh responded:  2021-08-31 21:27

Hello Shimin,
Thank you for bringing this to our attention. You are not missing any steps; this was a bug in Skyline Batch. It has since been fixed in Skyline Batch-daily, which you can install by uninstalling your current version of Skyline Batch and reinstalling from https://skyline.gs.washington.edu/software/SkylineBatch-daily/index.html. The main release of Skyline Batch will be updated with the bug fix soon, so keep an eye out for updates if you decide not to switch to Skyline Batch-daily.
Please let us know if the problem persists after installing Skyline Batch-daily or updating the program (when the new update becomes available).
Thanks,
Ali

 
schen19 responded:  2021-09-01 06:39

Hi Ali,

Thank you very much for your help!

Best,
Shimin

 
roman sakson responded:  2022-12-19 04:21

Hello Skyline team,

I decided to add to this existing thread, since I also seem to have an R-related problem within Skyline Batch (SB), albeit a different one. I am trying to implement the whole Skyline Batch process for MRM data analysis, from importing raw data to exporting a report called 'MRMNormalizeR.csv' and running an R script with it. Everything seems fine until R starts, but then I am getting an error message:

ERROR: Error in file(file, "rt") : cannot open the connection
[19/12/2022 13:05:59] Calls: read.csv -> read.table -> file
[19/12/2022 13:05:59] In addition: Warning message:
[19/12/2022 13:05:59] In file(file, "rt") :
[19/12/2022 13:05:59] cannot open file 'MRMNormalizeR.csv': No such file or directory

It seems that R cannot find the exported report. However, SB is definitely creating it in the Analysis folder, where the R script is also located. The exported report itself is fine, and I can run the R script without problems in RStudio; it just does not run within SB. I tried it with SB 21.2.0.389 and with 21.2.1.389. Is this issue known? I am attaching a log file and I am happy to provide more data but would rather upload it via a private link, if possible.

Thanking you in advance,
Roman

 
Brendan MacLean responded:  2022-12-19 12:14

Hi Roman,
I guess I would alter the R script to output the full path to the CSV file just before your call to read.csv. It seems strange to me that the error message is only mentioning the file name, and not its full path.

I think you need a bit more debug output to be completely sure that the expected values are making it to the right place, or if not, why not.
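
A minimal sketch of the kind of debug output suggested here, assuming the report path arrives as the first trailing command-line argument (as the log line in this thread suggests). The helper name `describe_report_path` and the fallback file name are only illustrative.

```r
# Hypothetical helper: collect the facts needed to see why read.csv() cannot
# find the report. Assumes the report path is the first trailing argument.
describe_report_path <- function(args = commandArgs(trailingOnly = TRUE),
                                 fallback = "MRMNormalizeR.csv") {
  report <- if (length(args) > 0) args[1] else fallback
  list(
    working_dir = getwd(),
    report_arg  = report,
    # Absolute form of the path where it can be resolved; no error if missing
    full_path   = normalizePath(report, winslash = "/", mustWork = FALSE),
    exists      = file.exists(report)
  )
}

info <- describe_report_path()
cat("Working directory:", info$working_dir, "\n")
cat("Report argument:  ", info$report_arg, "\n")
cat("Resolved path:    ", info$full_path, "\n")
cat("File exists:      ", info$exists, "\n")
```

Placing these lines just before the call to read.csv shows whether a bare file name is being resolved against an unexpected working directory.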

This line in your log file:

[19/12/2022 12:17:54] "C:\Users\Roman Sakson\Documents\Roman\PhD\PhD\Paper\Analytical_Chemistry\Manuscript\Data\Skyline_Batch\Analysis_Folder\22.11.12_MRM_GroupComp_Plotting_RS_Batch_FINAL.R" "C:\Users\Roman Sakson\Documents\Roman\PhD\PhD\Paper\Analytical_Chemistry\Manuscript\Data\Skyline_Batch\Analysis_Folder\MRMNormalizeR.csv"

seems to indicate that Skyline Batch is calling the right R file with the full path. Still, I imagine it would be helpful to have the entire command line, including the path to the R.exe used and the exact parameters passed on the command line.

Hope this helps. Let us know what you learn. Definitely this pipeline has worked with several different R scripts and reports in our own testing.

--Brendan

 
roman sakson responded:  2022-12-19 16:40

Hi Brendan,

yes, you are right: if I explicitly set the working directory, the error goes away. I added line 44 in the minimal example that I am attaching:

setwd("C:/Users/Roman Sakson/Desktop/Analysis_Support")

With that line it works, but it does not work without it, even though the csv and the R script are both in the analysis folder.

I need to do some more testing with a second computer to learn in which way file paths are being preserved within a bcfg. I would like to offer a bcfg file to people who know enough Skyline to run it in Skyline Batch but who know virtually no R, ideally enabling them to get R script outputs without having to interact with the code itself (a little bit like the MSstats plugin within Skyline). Let's see, whether this is feasible at the moment.

I might bother you again soon. Thanks for your help!
Roman

 
roman sakson responded:  2022-12-27 11:52
Hello Brendan and team,

I hope that you are enjoying Christmas time! A brief update from my post above, maybe helpful to some future reader here: I could easily solve the problem of setting the working directory in R by using the "setwd(choose.dir())" command. In this way, every user on any computer can interactively navigate to his or her analysis folder for Skyline Batch (SB).

I would like to report another minor issue, maybe already known: I noticed that when the box for "Culture-invariant report" for report export in SB is checked, then no spaces remain within column headers. This way, "Protein Name" becomes "ProteinName". I realize that this can have advantages in R, but caused me some trouble because I was using my old R code, written before SB came out. When I manually used to export my reports from Skyline as csv or tsv, spaces within column headers would remain (as they do when I uncheck the "Culture-invariant report" box). My R code would then read the exported csv in via

data <- read.csv(file = "Anyname.csv",
                 header = TRUE,
                 sep = ",",
                 stringsAsFactors = FALSE,
                 as.is = TRUE,
                 na.strings = "NA")

which would result in spaces being replaced by dots in R, so "Protein Name" became "Protein.Name". As I would later manipulate columns, my code expected "Protein.Name" etc and could not work with the culture-invariant "ProteinName" input from SB. Unchecking the box solved the issue for me at the moment, but I am not quite sure whether this is a bug or a feature.
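
The dots come from read.csv's `check.names` argument, which defaults to TRUE and converts headers into syntactically valid R names. A small sketch of the difference, using made-up inline data instead of a real Skyline report; the helper `normalize_headers` is hypothetical and assumes the two export styles differ only by spaces, as described in this thread.

```r
# Made-up two-column report, supplied inline instead of as a file
csv_text <- "Protein Name,Total Area\nGAPDH,1200\nACTB,3400"

# Default: check.names = TRUE turns "Protein Name" into "Protein.Name"
with_dots <- read.csv(text = csv_text)

# check.names = FALSE keeps the headers exactly as written in the file
as_written <- read.csv(text = csv_text, check.names = FALSE)

# Hypothetical helper: strip spaces and dots so the same downstream code works
# with "Protein Name", "Protein.Name", and the invariant "ProteinName"
normalize_headers <- function(df) {
  names(df) <- gsub("[ .]", "", names(df))
  df
}

print(names(with_dots))
print(names(as_written))
print(names(normalize_headers(with_dots)))
```

Normalizing the headers once, right after reading, would let older code run against either export style, provided it is updated to use the space-free names.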

Best wishes,
Roman
 
Nick Shulman responded:  2022-12-27 12:02
I think things are working the way that they are supposed to.

When you export a report in Skyline using the "File > Export > Report" menu item, there is a dropdown at the bottom of the "Export Report" dialog where you can choose the language.
If you are exporting something that is intended to be read by a computer program, we would recommend that you choose "Invariant" as the language.
When the language is invariant, the column headers have their spaces removed, and all numbers are formatted to as many decimal places as are available.

-- Nick
 
roman sakson responded:  2022-12-27 12:38
Oh, ok, I never noticed the language option; I was always focused on the "Save as type" dropdown, where I choose whether I want a csv or a tsv file. Maybe this is also because I never really use the "File > Export > Report" menu item, but rather greedily press the "Export..." button directly within the Document Grid once I am satisfied with the data (no language option there). Anyway, thanks Nick, once again I have learned something new about Skyline :).

I have a more general question though: the R code I would like to use together with Skyline Batch (SB) basically performs a group comparison across several conditions, just like Skyline would do it, but in an automated manner and with some more visualization features. Is there any way to perform new condition annotations with SB, after raw data have been imported but before the report is exported? There seem to be some refinement options available regarding previously saved group comparisons in Skyline, but those would need to be created manually by the user in Skyline, is that correct?

Roman
 
Nick Shulman responded:  2022-12-27 13:48
Maybe we should add a file type to the "save as" dialog so that you can use the "invariant" format without going through "File > Export > Report".

I wish we had a better name for this format. If the choices were the following, would people be able to guess what they mean:
CSV (Comma delimited) (*.csv)
TSV (Tab delimited) (*.tsv)
Invariant CSV (Comma delimited for computers) (*.csv)
(suggestions for better wording would be appreciated)

I am not sure I understand your question.

If you are already in R code, you might want to just invoke MSstats to do your group comparisons. When we wrote the group comparison feature in Skyline we copied the implementation from MSstats. There were some things that we could not copy because the necessary statistical functions were only available in R, not C#. So, MSstats does a better job with BioReplicate values. Skyline has to just average the observed values with the same Bio Replicate value before doing the linear regression, but MSstats is able to do something with a Linear Mixed Effects Model which is a little better.

If you are asking about how to set annotation values on replicates, there are a couple of things to do. One is that there is now a "--import-annotations" command line argument which will allow you to specify a CSV file with annotation values.
Another thing you might want to do is that if the annotation value can actually be found in the raw file's filename, then you could use the "Result File Rules" feature to set annotation values based on regular expressions applied to the filename.
Here's a page which talks about how to use Result File Rules:
https://skyline.ms/wiki/home/software/Skyline/page.view?name=ResultFileRules
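
To illustrate the idea of deriving an annotation from the filename with a regular expression, here is a sketch on the R side. It is not how Skyline implements Result File Rules; the file names, the naming scheme, and the helper `condition_from_filename` are all hypothetical, and R's regex engine may differ from Skyline's for more complex patterns.

```r
# Hypothetical file names following <date>_<condition>_<replicate>.raw
file_names <- c("221112_Control_01.raw",
                "221112_Treated_01.raw",
                "221112_Treated_02.raw")

# Hypothetical helper: return the captured condition, or NA when the file
# name does not follow the expected scheme
condition_from_filename <- function(x,
                                    pattern = "^[^_]+_([^_]+)_[^_]+\\.raw$") {
  matched <- grepl(pattern, x)
  out <- rep(NA_character_, length(x))
  out[matched] <- sub(pattern, "\\1", x[matched])
  out
}

print(condition_from_filename(file_names))
```

Trying a pattern on a handful of real file names like this can help confirm that it captures the intended part before it is entered as a rule.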

-- Nick
 
schen19 responded:  2022-12-27 15:36

Hi Roman,

I had a similar question before and actually found the solution in one of the demo scripts provided by the Skyline team when Skyline Batch/Skyline Command Line was initially introduced. It has been a while and I couldn't find the original code now. Here is what I used in my script:

This R script uses the file path of the Skyline report as the working directory:

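# The report path is passed as the first trailing command-line argument
script_args <- commandArgs(trailingOnly = TRUE)

isCommand <- length(script_args) > 0
if (isCommand) {
  reportPath <- script_args[1]
} else {
  # stop() signals the error; errorCondition() only creates a condition object
  stop("Did not receive a report path argument")
}
# Convert Windows backslashes to forward slashes
reportPath <- gsub("\\\\", "/", reportPath)
# Use the folder containing the report as the working directory
rootPath <- dirname(reportPath)
setwd(rootPath)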

This may be helpful if someone wants to remove the step of manually specifying the working directory. Again, all credits to the Skyline team.
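
A variation on the same idea, as a sketch: resolve the report path once and read the report by its full path, so the result does not depend on the working directory. The helper `resolve_report_path` and its `fallback` parameter are hypothetical; the fallback is meant for interactive runs (for example in RStudio) where no command-line argument is passed.

```r
# Hypothetical helper: take the report path from the command line when present,
# otherwise fall back to a path supplied by the caller
resolve_report_path <- function(args = commandArgs(trailingOnly = TRUE),
                                fallback = NULL) {
  if (length(args) > 0) {
    path <- args[1]
  } else if (!is.null(fallback)) {
    path <- fallback
  } else {
    stop("Did not receive a report path argument and no fallback was given")
  }
  # mustWork = TRUE fails early with a clear message if the file is missing
  normalizePath(path, winslash = "/", mustWork = TRUE)
}

# Usage (commented out so the sketch can be sourced without a real report):
# report_path <- resolve_report_path(fallback = "MRMNormalizeR.csv")
# data <- read.csv(report_path)
# setwd(dirname(report_path))  # optional, if later output uses relative paths
```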

Shimin

 
roman sakson responded:  2022-12-27 16:05

Hi Shimin,

thank you, this is very good to know!

Best regards,
Roman

 
roman sakson responded:  2022-12-27 16:44

Hi Nick,

well, I am not sure that I would be able to guess what an invariant CSV is (but I am also not really a command line person, as you can easily tell from my struggles here). Not sure what a better wording could be. The normal csv seems to contain the same area numbers that are being displayed in my Skyline document grid, so I hope simply using those for calculations should not introduce too much bias. Are calculations in the background of Skyline being performed on more precise numbers?

I am happy with what Skyline can do and would like to leave MSstats out of it for now. However, the Result File Rules seem to be exactly what I am looking for: to define the condition based on a regular expression within the filename! Thanks a ton, I am getting close but it works for me only within the Skyline GUI but not via SB. Here is what I observe:

  1. I am saving my results rule and the custom report within the empty template document.

  2. I start my SB configuration, the template document gets copied to the analysis folder and the raw file import occurs.

  3. The exported report still contains an empty condition column, so it does not work and R reports an error. When I then open the copied template file from the analysis folder with Skyline 22.2, the condition column from my report is indeed empty, even though the results rule is checked and the Preview shows exactly what I want.

  4. I then go to Document Settings... -> Annotations, uncheck Condition and save. Skyline warns me that there is then a problem with the results rule, which I ignore. Then, I again go to Document Settings... -> Annotations, check Condition back on and save again. Now, the results rule kicks in and my custom report gets all the correct Conditions assigned according to the filename. If I would start SB now from the report export step, my R script would work.

There seems to be some kind of a "refresh" command missing for an existing results rule when using SB instead of the GUI. Or, maybe, I am missing a step?

Thank you for your help,
Roman

 
Nick Shulman responded:  2022-12-27 17:01
It looks like there might be a problem with applying result file rules when importing result files via the command line.
If the problem is what I think it is, then importing result files one at a time on the commandline would work, but if you specify more than one file to import, the result file rules will not be applied.
I will try to figure out what is going on.
-- Nick
 
roman sakson responded:  2022-12-27 17:17
Hi Nick,

thank you! By the way, are common prefix and suffix parts of file names ignored by default when using the command line for import?

Roman
 
roman sakson responded:  2023-07-06 15:34
Hi Skyline Team,

I decided to add to this older thread, since I am still working on combining my R script with Skyline Batch (SB). As I mentioned before, my goal is to remove the need for the user to interact with the R code directly via RStudio or other tools. I am using the "setwd(choose.dir())" command to set the working directory in R via SB. This works really well: the same interactive Windows Explorer window that I was getting in RStudio also appears when I am using SB.

However, later on I would like to ask the user of the script, what protein from the dataset should be used as housekeeping for data normalization. In this example, "select" contains all proteins present in the corresponding Skyline custom report. We use the function dlgInput from the svDialogs package:

col.name <- dlgInput(paste0("Enter the name of the protein to normalize against \n(", paste(select, collapse = "\n "), "): "), Sys.info()["user"])$res

while (!(col.name %in% select)) {
  col.name <- dlgInput(paste0("Invalid input. Enter name of the protein from these options", paste(select, collapse = "\n"), ": "), Sys.info()["user"])$res
}

When I run this in RStudio, I get a nice pop-up prompt window with the names of all the proteins and then I can type my housekeeping protein name in (see screenshot). However, when the script runs in SB, it just waits but no pop-up window appears. I tested and SB definitely stops the script at this particular step. Can one tell, why it works for the setwd function but not for the dlgInput function? Any ideas how I might get what I want? I hope that my issue is clear, I am happy to provide more information.

Roman
 
Nick Shulman responded:  2023-07-06 16:34
I am not familiar with "dlgInput" at all, but I did find this documentation page:
https://search.r-project.org/CRAN/refmans/svDialogs/html/dlg_input.html

I don't think you are supposed to have that 'Sys.info()["user"]' there.
The second argument to the "dlgInput" function is "default", which I assume is the value that will be returned if the user does not type anything.
The example code on the documentation page has the dialog asking "Who are you?", so it makes sense that the default value would be the username from the system.

I see that dlgInput has an argument called "rstudio". I wonder whether you would get different behavior if you passed in either TRUE or FALSE for that argument.

Sometimes functions which are supposed to bring up a user interface behave differently when they are run as part of a script. The reason for this is that you would not want your script to get stuck when it's running on a server that is not attached to a keyboard or monitor or anything (in Java, this is called a "headless" environment).

I would recommend that you try to debug this by writing a very simple script which just brings up a dialog box and see whether you can get that to work in Skyline Batch.
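
A possible starting point for such a test script, as a sketch: it first prints how R itself sees the session, since GUI functions may behave differently when `interactive()` is FALSE. The helper `session_report` is hypothetical, and whether the commented-out dialog call actually appears under Skyline Batch is exactly what the test would need to show.

```r
# Hypothetical helper: summarize the session the script is running in
session_report <- function() {
  list(
    interactive = interactive(),                         # FALSE under Rscript
    has_tcltk   = isTRUE(unname(capabilities("tcltk"))), # Tcl/Tk support built in?
    platform    = .Platform$OS.type,
    args        = commandArgs(trailingOnly = FALSE)      # full command line seen by R
  )
}

report <- session_report()
cat("interactive():", report$interactive, "\n")
cat("tcltk available:", report$has_tcltk, "\n")
cat("platform:", report$platform, "\n")
cat("command line:", paste(report$args, collapse = " "), "\n")

# Uncomment to test a single, very simple dialog from the tcltk package:
# tcltk::tk_messageBox(type = "ok", message = "Hello from the R script")
```
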
-- Nick
 
roman sakson responded:  2023-07-07 17:13
Hi Nick,

you are right, 'Sys.info()["user"]' just puts my Windows user name into the prompt box, which I then have to replace with the name of the desired protein. Deleting it leaves the prompt box empty, as in my screenshot. It does not solve my problem with SB though. Interestingly, when I set rstudio = FALSE, it works neither in RStudio nor in SB. rstudio = TRUE is the default setting, which does not help me with SB either.

I wonder why setwd(choose.dir()) works with SB but svDialogs does not. However, the way these packages are written might simply be too complicated for me, as I am not much of an R guy...

Do you happen to know of any package that works with SB and might be able to do what I want?
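
One option that avoids dialogs altogether, offered only as a sketch and not tested in Skyline Batch: read the housekeeping protein from a small text file that the user edits before starting the run. The file name "housekeeping.txt", the helper `read_housekeeping`, and the protein names are hypothetical; `select` stands for the vector of protein names from the report, as in the question above.

```r
# Hypothetical helper: read the housekeeping protein from the first non-empty
# line of a text file and validate it against the proteins in the report
read_housekeeping <- function(config_path, select) {
  if (!file.exists(config_path)) {
    stop("Config file not found: ", config_path)
  }
  lines <- trimws(readLines(config_path, warn = FALSE))
  lines <- lines[nzchar(lines)]
  if (length(lines) == 0) {
    stop("Config file is empty: ", config_path)
  }
  choice <- lines[1]
  if (!(choice %in% select)) {
    stop("'", choice, "' is not in the report. Valid options: ",
         paste(select, collapse = ", "))
  }
  choice
}

# Usage with a temporary file standing in for "housekeeping.txt"
select <- c("GAPDH", "ACTB", "TUBB")
cfg <- tempfile(fileext = ".txt")
writeLines("GAPDH", cfg)
col.name <- read_housekeeping(cfg, select)
print(col.name)
```

The trade-off is that the user edits a text file instead of answering a prompt, but the script then no longer depends on a dialog being shown.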

Roman
 
Nick Shulman responded:  2023-07-09 22:31
The R expert that I talked to thought that there might be helpful information in the Answer on this page:
https://stackoverflow.com/questions/48211294/running-a-gui-written-in-r-via-script

-- Nick
 
roman sakson responded:  2023-07-11 06:46
Thank you!