Formatting a text file in python - arcpy

This is my code:
import arcpy
arcpy.env.workspace = "C:\Users\Brett\Desktop\lesson_6\Lesson6_Data"
infc = "Cities.shp"
outputFile = open("C:\Users\Brett\Desktop\lesson_6\Lesson6_Data\Output_Cities1.txt", "w")
arcpy.GetParameterAsText(0)
fc = "Cities.shp"
fields = ["NAME","SHAPE@XY"]
with arcpy.da.SearchCursor(fc, fields) as cursor:
    for row in cursor:
        outputFile.write('{0}, {1}'.format(row[0],(row[1])))
print "done"
This is what the output looks like:
Hiawatha, (-1050316.3479999993, 2067521.4093999993)Powder Wash, (-1025371.6007000003, 2059421.7783000004)Kings Canyon, (-852695.0120999999, 2036738.5595999993)Columbine, (-915047.0152000003, 2035509.35099999
But I want it to look like this:
FredRanch1_1, 529018.125025, 4108038.05548
FredRanch1_1, 529005.718792, 4108028.20659
FredRanch1_1, 528993.340503, 4108018.73931
FredRanch1_1, 528980.990158, 4108009.65364
FredRanch1_1, 528968.667757, 4108000.94958
etc....
Any suggestions on how to format it correctly?

SHAPE@XY returns a tuple of (x, y) values, which you can access by their index position. You can put each record on its own line by adding \n to the end of the format string.
for row in cursor:
    outputFile.write('{0}, {1}, {2}\n'.format(row[0], row[1][0], row[1][1]))
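The same formatting can be tried without arcpy. This is only a sketch: the rows list is a hypothetical stand-in for what the cursor yields (a name plus an (x, y) tuple), and io.StringIO stands in for the output file.

```python
import io

# Hypothetical stand-in for rows yielded by the cursor: (name, (x, y)).
rows = [
    ("Hiawatha", (-1050316.348, 2067521.4094)),
    ("Powder Wash", (-1025371.6007, 2059421.7783)),
]

def write_rows(rows, out):
    """Write one 'name, x, y' line per row to a file-like object."""
    for name, (x, y) in rows:
        # Unpack the coordinate tuple and end every record with a newline.
        out.write('{0}, {1}, {2}\n'.format(name, x, y))

buffer = io.StringIO()  # stands in for the real output file
write_rows(rows, buffer)
result = buffer.getvalue()
print(result, end="")
```

Unpacking the tuple in the for statement does the same job as row[1][0] and row[1][1] above.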

Related

Problem Replacing <br> Tags with Newline Using bs4

Problem: I cannot replace <br> tags with a newline character using Beautiful Soup 4.
Code: My program (the relevant portion of it) currently looks like
for br in board.select('br'):
    br.replace_with('\n')
but I have also tried board.find_all() in place of board.select().
Results: When I use br.replace_with('\n'), all <br> tags appear to be replaced with the string literal \n. For example, <p>Hello<br>world</p> ends up as Hello\nworld. Using br.replace_with(\n) causes the error
File "<ipython-input-27-cdfade950fdf>", line 10
br.replace_with(\n)
^
SyntaxError: unexpected character after line continuation character
Other Information: I am using a Jupyter Notebook, if that is of any relevance. Here is my full program, as there may be some issue elsewhere that I have overlooked.
import requests
from bs4 import BeautifulSoup
import pandas as pd
page = requests.get("https://boards.4chan.org/g/")
soup = BeautifulSoup(page.content, 'html.parser')
board = soup.find('div', class_='board')
for br in board.select('br'):
    br.replace_with('\n')
message = [obj.get_text() for obj in board.select('.opContainer .postMessage')]
image = [obj['href'] for obj in board.select('.opContainer .fileThumb')]
pid = [obj.get_text() for obj in board.select('.opContainer .postInfo .postNum a[title="Reply to this post"]')]
time = [obj.get_text() for obj in board.select('.opContainer .postInfo .dateTime')]
for x in range(len(image)):
    image[x] = "https:" + image[x]
post = pd.DataFrame({
    "ID": pid,
    "Time": time,
    "Image": image,
    "Message": message,
})
post
pd.options.display.max_rows
pd.set_option('display.max_colwidth', -1)
display(post)
Any advice would be appreciated. Thank you for reading.
Instead of replacing the tags after parsing, try replacing the <br> tags in the raw HTML before building the soup, like this:
soup = BeautifulSoup(page.text.replace('<br>', '\n'), 'html.parser')
Hope this helps! Cheers!
P.S.: I could not find a logical reason why the replacement does not work after parsing into soup.
After experimenting with variations of
page = requests.get("https://boards.4chan.org/g/")
str_page = page.content.decode()
str_split = '\n<'.join(str_page.split('<'))
str_split = '>\n'.join(str_split.split('>'))
str_split = str_split.replace('\n', '')
str_split = str_split.replace('<br>', ' ')
soup = BeautifulSoup(str_split.encode(), 'html.parser')
for the better part of two hours, I have determined that the pandas DataFrame displays the newline character as the literal text \n. Everything else indicates that the program is working as intended, so I assume this has been the problem all along.
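The difference between a real newline and its escaped display can be checked without pandas or Beautiful Soup. A minimal sketch, assuming the message text really does contain a newline character (the message value here is made up):

```python
# A string holding one real newline character between the two words.
message = "Hello\nworld"

# repr() escapes the newline, so it is shown as the two characters \ and n.
escaped_view = repr(message)

# The underlying string still has a single newline and splits into two lines.
lines = message.splitlines()

print(escaped_view)  # escaped form, similar to what a table display may show
print(message)       # printed form, with an actual line break
```

If the escaped form only shows up in the table display and print() gives a line break, the replacement itself worked.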

Better way to clean product description using BeautifulSoup?

I have written the following code to fetch a product description from a site using BeautifulSoup:
def get_soup(url):
    try:
        response = requests.get(url)
        if response.status_code == 200:
            html = response.content
            return BeautifulSoup(html, "html.parser")
    except Exception as ex:
        print("error from " + url + ": " + str(ex))
def get_product_details(url):
    try:
        soup = get_soup(url)
        prod_details = dict()
        desc_list = soup.select('p ~ ul')
        prod_details['description'] = ''.join(desc_list)
        return prod_details
    except Exception as ex:
        logger.warning('%s - %s', ex, url)
if __name__ == '__main__':
    get_product_details("http://www.aprisin.com.sg/p-748-littletikespoptunesguitar.html")
In the above code I am trying to convert the description (a list) to a string, but I get the following error:
[WARNING] aprisin.py:82 get_product_details() : sequence item 0: expected str instance, Tag found - http://www.aprisin.com.sg/p-748-littletikespoptunesguitar.html
Output of the description without converting it to a string:
[<ul>
<li>Freestyle</li>
<li>Play along with 5 pre-set tunes: </li>
</ul>, <ul>
<li>Each string will play a note</li>
<li>Guitar has a whammy bar</li>
<li>2-in-1 volume control and power button </li>
<li>Simple and easy to use </li>
<li>Helps develop music appreciation </li>
<li>Requires 3 "AA" alkaline batteries (included)</li>
</ul>]
You're trying to join a list of Tags, but the join method needs str arguments. Try:
''.join([str(i) for i in desc_list])
You are passing a list of Tag objects instead of strings to join(), which only works with a list of strings. Change the join() call as follows:
prod_details['description'] = ''.join([tag.get_text() for tag in desc_list])
or, only when tag.string is not None for every tag (it is None when a tag contains more than one child, and join() would then fail again):
prod_details['description'] = ''.join([tag.string for tag in desc_list])
If you want the description together with its HTML markup, you can use the following:
# this will preserve the html tags and indentation.
prod_details['description'] = ''.join([tag.prettify() for tag in desc_list])
or
# this will return the html content as string.
prod_details['description'] = ''.join([str(tag) for tag in desc_list])
desc_list is a list of bs4.element.Tag objects. You should convert the tag to a string:
desc_list = soup.select('p ~ ul')
prod_details['description'] = str(desc_list[0])
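The error and the fix can be reproduced without Beautiful Soup. In this sketch FakeTag is a hypothetical stand-in class (not part of any library) whose only job is to be a non-str object with a useful str() form.

```python
class FakeTag:
    """Hypothetical stand-in for a parsed tag; not part of any library."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup

desc_list = [
    FakeTag("<ul><li>Freestyle</li></ul>"),
    FakeTag("<ul><li>Guitar has a whammy bar</li></ul>"),
]

# Joining the objects directly fails because join() accepts only str items.
try:
    "".join(desc_list)
    join_failed = False
except TypeError:
    join_failed = True

# Converting each item with str() first makes the join succeed.
description = "".join(str(tag) for tag in desc_list)
print(join_failed, description)
```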

Combine 1 field from multiple rows via AmpScript

I have a data extension which contains rows and columns such as:
emailAddress     orderNumber  firstName  lastName  customerOrder
cust1@gmail.com  1111         Bill       Adams     2 brown shoes
cust1@gmail.com  1111         Bill       Adams     2 green socks
cust1@gmail.com  1111         Bill       Adams     1 orange backpack
cust1@gmail.com  2222         Bill       Adams     2 pink gloves
cust2@gmail.com  3333         David      Sherwood  5 yellow hats
What I'm trying to do is to create an order received email from this data, preferably without altering it from the source. So ideally the email output would group the customerOrder for each customer, based on the orderNumber. Then the customerOrder is concatenated and inserted into an email (note the above is simplified quite a bit, the customerOrder is actually HTML for insertion into an HTML table within the email).
So far I've been able to make this much very basic progress:
%%[
Set @customerOrder =
    LookupOrderedRows("transactionsList",
                      "0",
                      "customerOrder",
                      "orderNumber",
                      "1111")
]%%
With this code I can see that I have 3 entries for order number 1111. But now I'm stuck. Do I need to create an if/then loop? Or is there some way to take the output from the LookupOrderedRows function and parse it for use in the HTML table within the email?
Using one of the lookup examples on my blog, you can do something like this:
%%[
var @rows, @row, @rowCount, @numRowsToReturn, @emailAddress, @i, @prevOrderNumber
set @emailAddress = AttributeValue("emailaddr")
set @numRowsToReturn = 0 /* 0 means all */
set @rows = LookupOrderedRows("transactionsList", @numRowsToReturn, "orderNumber", "emailAddress", @emailAddress)
set @rowCount = rowcount(@rows)
if @rowCount > 0 then
  set @prevOrderNumber = ""
  for @i = 1 to @rowCount do
    var @orderNumber, @firstName, @lastName, @customerOrder
    set @row = row(@rows,@i) /* get row based on loop counter */
    set @orderNumber = field(@row,"orderNumber")
    set @firstName = field(@row,"firstName")
    set @lastName = field(@row,"lastName")
    set @customerOrder = field(@row,"customerOrder")
    /* output headings for first order or when order # changes */
    if empty(@prevOrderNumber) or @prevOrderNumber != @orderNumber then
      outputline(concat("<br>Order #:", @orderNumber))
      outputline(concat("<br>Name: ", @firstName, " ", @lastName))
      outputline(concat("<br>Line items:<br>"))
      set @prevOrderNumber = @orderNumber
    endif
    outputline(concat("<br>",@customerOrder))
  next @i
else
  outputline(concat("<br>No transactionsList rows found"))
endif
]%%
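The grouping logic in that loop (print a heading whenever the order number changes) may be easier to follow outside AMPscript. Here is a rough Python sketch of the same pattern; the rows list is copied from the sample data for one customer, and render_orders is a made-up name.

```python
# Sample rows for one customer, already sorted by order number
# (the lookup above orders its result on "orderNumber").
rows = [
    {"orderNumber": "1111", "firstName": "Bill", "lastName": "Adams", "customerOrder": "2 brown shoes"},
    {"orderNumber": "1111", "firstName": "Bill", "lastName": "Adams", "customerOrder": "2 green socks"},
    {"orderNumber": "1111", "firstName": "Bill", "lastName": "Adams", "customerOrder": "1 orange backpack"},
    {"orderNumber": "2222", "firstName": "Bill", "lastName": "Adams", "customerOrder": "2 pink gloves"},
]

def render_orders(rows):
    """Return output lines, emitting a heading whenever the order number changes."""
    lines = []
    prev_order_number = None
    for row in rows:
        if row["orderNumber"] != prev_order_number:
            # First row overall, or first row of a new order: print headings.
            lines.append("Order #: " + row["orderNumber"])
            lines.append("Name: " + row["firstName"] + " " + row["lastName"])
            lines.append("Line items:")
            prev_order_number = row["orderNumber"]
        lines.append(row["customerOrder"])
    return lines

output_lines = render_orders(rows)
print("\n".join(output_lines))
```

This only works when the rows arrive sorted by order number, which is why the lookup orders on that column.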

Why does field value with space contain a new line when exporting to a text file in Progress 4GL?

I am able to export the data to a text file, but the formatting in the text file is not good. For example, when a field value has a space in it, a new line appears in the file.
Sample data:
846438828|10121803||HEIN|KATIE|270||PEBBLE
CREEK|DR|||usa|GA|30605||7DAY|1|2|
842486060|1012||GUNTER|LEWELL|230||MCDUFFIE|DR|||ATHENS|GA|30605|7065430640|FRI-SUN|1|2|
889388948|101205||WEEKS|J D|183||MELL|ST|||ATHENS|GA|30605|7065481437|SUNONLY|1|2|
The value of the field streetname is "PEBBLE CREEK", but in the report it looks like:
PEBBLE
CREEK
Why does this happen?
def var v-copies as inte no-undo.
def var v-phone as char format "x(16)" no-undo.
def var v-loc as char no-undo.
def var v-file as char format "x(30)" no-undo.
def var v-demoid as char format "x(20)" no-undo.
def var v-email as char format "x(30)" no-undo.
def var v-hostname as char format "x(20)" no-undo.
def var v-RouteIDs as char no-undo.
def var v-Product as char no-undo.
def var v-ExDir as char format "x(80)" no-undo.
def var v-LookBack as int no-undo init 90.
{tools/altpubs/audit/var.i}
{tools/altpubs/audit/procedures.i}
def stream sout.
def temp-table tt-demo
    field entityid as int format ">>>>>>>>>9"
    field answer like DemographicAnswer.Answer.
v-ConfigFile = search(v-ConfigFile).
if v-ConfigFile = ? then do:
    message "config file config.csv was not found" view-as alert-box.
    RETURN "ERROR".
end.
input from value(v-ConfigFile).
run ReadConfig.
input close.
for each tt-Config where tt-Config.Section = 'local' and
    tt-Config.SectionValue <> ?:
    v-loc = tt-Config.SectionValue.
    case tt-Config.SettingName:
        when 'ExchDir' then v-ExDir = tt-Config.SettingValue.
        when 'Product' then v-Product = tt-Config.SettingValue.
        when 'Routes' then v-RouteIDs = tt-Config.SettingValue.
        when 'LookBack' then
        do:
            v-LookBack = integer(tt-Config.SettingValue) no-error.
            if error-status:error then v-LookBack = 90.
        end.
    end.
end.
v-ExDir = v-Exdir + lc(v-loc) + "/".
file-info:file-name = v-ExDir.
if not( file-info:file-type begins "D") or file-info:file-type = ? then
do:
    unix silent makedir value(v-ExDir) && chmod 777 value(v-ExDir).
    file-info:file-name = v-ExDir.
end.
assign
    v-File = v-ExDir + lc(v-Product) + "Audit" +
             string(month(today),"99") + "-" +
             string(day(today),"99") + "-" +
             substring(string(year(today),"9999"),3) + ".txt".
for each DemographicAnswer where DemographicAnswer.DemographicId = v-RouteIDs
    no-lock:
    create tt-demo.
    assign tt-demo.entityid = int(DemographicAnswer.EntityId)
           tt-demo.answer = DemographicAnswer.Answer.
end.
output stream sout to value(v-file).
put stream sout unformatted
    "HEADER B2 " string(today) skip.
for each tt-demo,
    each Subscription no-lock
    where Product = v-product
    and SubscriptionID = tt-demo.entityid
    and Subscriber = yes
    and Getspaper = yes:
    find last RouteSubscription of Subscription no-lock no-error.
    if available routeSubscription then do:
        for each Occupant of Subscription no-lock,
            each Address of Subscription no-lock:
            find OccupantPhone of Occupant no-lock no-error.
            if available OccupantPhone then
                v-phone = OccupantPhone.AreaCode + OccupantPhone.Phone.
            else
                v-phone = "".
            find last OccupantEmail of Occupant no-lock no-error.
            if available OccupantEmail then
                v-email = OccupantEmail.EmailAddress.
            else
                v-email = "".
            case DeliveryScheduleId:
                when "MON-FRI" then v-copies = RouteSubscription.Copies[2].
                when "FRI-SUN" then v-copies = RouteSubscription.Copies[1].
                when "SUNONLY" then v-copies = RouteSubscription.Copies[1].
                when "7DAY" then v-copies = RouteSubscription.Copies[1].
                when "MON-SAT" then v-copies = RouteSubscription.Copies[2].
                when "THUONLY" then v-copies = RouteSubscription.Copies[5].
                when "WEDONLY" then v-copies = RouteSubscription.Copies[4].
                when "SATSUN" then v-copies = RouteSubscription.Copies[1].
            end case.
            put stream sout unformatted
                tt-demo.Answer "|"
                Subscription.SubscriptionId "|"
                Subscription.Product "|"
                Occupant.LastName "|"
                Occupant.FirstName "|"
                trim( Address.HouseNumber) "|"
                trim(Address.Postdirectional) "|"
                trim(Address.StreetName) "|"
                trim(Address.StreetSuffixId) "|"
                trim(Address.postdirectional) "|"
                trim(Address.UnitDesignatorID + trim(Address.UnitNumber)) "|"
                Address.CityId "|"
                Address.StateId "|"
                Address.ZipCode "|"
                v-phone "|"
                Subscription.DeliveryScheduleId "|"
                v-copies "|"
                "2" "|"
                v-email skip.
        end.
    end.
end.
put stream sout unformatted
    "TRAILER ".
output stream sout close.
This could really only depend on a few things but it's hard to answer without seeing your code.
1) There really isn't a new line there. It only looks like one because your text editor wraps the line when you open the file. If this is the case, maximizing or resizing the editor window will change where the line break is displayed.
2) There really is a new line in the field, and it gets exported. If you are exporting values with EXPORT, you could try something like this to remove new line characters:
EXPORT REPLACE(streetname, "~n","").
If this has an effect, you have new lines in your database.
3) Something is wrong with the way you export data. Since you're not posting example code (it's always a good idea to do that), we cannot know about this.
My bets are on number 1 or 2. If you use a straightforward exporting method like EXPORT you really shouldn't get into trouble...
Pretty much what Jens said (I'd have left this as a comment, but Stack Overflow won't let me..).
Try opening the file(s) in Notepad++ and disable View->Wrap, or do a similar thing with an editor of your choice.
The linefeeds are in all likelihood "not really there".
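If you would rather check the file than trust the editor, counting the fields per line settles it. Here is a sketch in Python rather than Progress 4GL; find_broken_records and the three-field sample are made up for illustration.

```python
def find_broken_records(text, expected_fields, delimiter="|"):
    """Return (line_number, line) pairs whose field count differs from expected."""
    broken = []
    for number, line in enumerate(text.split("\n"), start=1):
        # Skip the empty piece after the final newline; flag short or long lines.
        if line and len(line.split(delimiter)) != expected_fields:
            broken.append((number, line))
    return broken

# Hypothetical three-field export; the second record is split by a real newline.
sample = "1|MCDUFFIE|DR\n2|PEBBLE\nCREEK|DR\n3|MELL|ST\n"
broken_records = find_broken_records(sample, expected_fields=3)
print(broken_records)
```

An editor's word wrap never changes the file contents, so a record only shows up in this list when the file really contains a newline character inside it.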
I had the same kind of problem some time ago. The problem was in the content of the data field, which contained CHR(10) (line feed) and CHR(13) (carriage return) characters entered by users. I found a simple workaround: create a function that converts those characters into '' and use it in the PUT statements. I'll use two of your fields as an example ...
FUNCTION stringExport RETURNS CHAR
    ( INPUT p-input AS CHAR ):
    RETURN TRIM(REPLACE(REPLACE(p-input,CHR(13),''),CHR(10),'')).
END FUNCTION. /* stringExport */
PUT STREAM sout UNFORMATTED
    stringExport(Occupant.FirstName) "|"
    stringExport(Address.HouseNumber) "|" SKIP.
Doing this may solve your problem. You can of course replace the characters with ' ' instead of ''; it depends on your needs.
Hope it helps.
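For anyone doing the export outside Progress, the same clean-up looks roughly like this in Python. It is a sketch only: string_export mirrors the function above, export_record is a made-up helper, and the sample record is invented.

```python
def string_export(value):
    """Remove carriage returns and line feeds, then trim surrounding spaces."""
    return value.replace("\r", "").replace("\n", "").strip()

def export_record(fields, delimiter="|"):
    """Join cleaned fields into one delimited record on a single line."""
    return delimiter.join(string_export(field) for field in fields)

# Hypothetical record where the street name contains an embedded line break.
record = export_record(["KATIE", "270", "PEBBLE \r\nCREEK", "DR"])
print(record)
```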
Looks like the field you are having trouble with is address.postdirectional. As others have said, the first thing to check is that there isn't a hidden line break in the data.
In the procedure editor / tramlines, try something simple like:
for each address no-lock:
    display address.postdirectional.
end.
You should see any line breaks in the data at that point, too.

Get Data from Clipboard in Firefox 19.0

We upgraded to FF 19.0 from 12.0 and the clipboard access is completely restricted and I am not able to get data from clipboard.
In earlier versions of FF, the following used to work.
netscape.security.PrivilegeManager.enablePrivilege("UniversalXPConnect");
var clip = Components.classes["@mozilla.org/widget/clipboard;1"].getService(Components.interfaces.nsIClipboard);
var trans = Components.classes["@mozilla.org/widget/transferable;1"].createInstance(Components.interfaces.nsITransferable);
Use case:
For input text fields, when multi-line text is pasted, I would like to replace the line breaks with a separator of my choice instead of the default space character.
E.g.: test1\n
      test2\n
      test3
On pasting this text into an input text field in FF:
Output seen: test1 test2 test3
Output required: test1,test2,test3 (when the separator is ',')
                 test1;test2;test3 (when the separator is ';')
Requirement:
The pasted text should be modified before it is pasted into the text field, and the only way seems to be access to the clipboard.
I tried the following links, but they did not help.
https://support.mozilla.org/en-US/questions/948379
http://stackoverflow.com/questions/14809226/cut-copy-and-paste-is-not-working-for-firefox-15-onwords
I have tried modifying the user prefs to allow clipboard access, which did not work.
user_pref("capability.policy.policynames", "allowclipboard");
user_pref("capability.policy.allowclipboard.sites", mydomain);
user_pref("capability.policy.allowclipboard.Clipboard.cutcopy", "allAccess");
I am not supposed to use flash objects to get access to clipboard (ZClip or ZeroClipboard).
Appreciate your responses. Thanks in advance.
Try it this way: http://jsfiddle.net/kUEBs/3/ (works on Firefox 23)
<div style="border:1px solid grey;height:50px;" id="paste_ff" type="text" contenteditable></div>
<script type="text/javascript">
var pasteCatcher = document.getElementById('paste_ff');
document.getElementById('paste_ff').addEventListener('DOMSubtreeModified',function(){
    if(pasteCatcher.children.length == 1){
        var text = pasteCatcher.innerHTML; console.log(text);
        //text2 = text.replace(/\n/g, "___"); console.log(text);
        text2 = text.replace("<br>","____");
        if(text2 != text){
            pasteCatcher.innerHTML = text2;
        }
    }
},false);
</script>
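The separator replacement itself does not depend on how the pasted text is obtained. It is sketched here in Python just to pin down the expected output from the question; join_pasted_lines is a made-up name, and in the browser the same logic would have to run in whatever handler receives the pasted text.

```python
def join_pasted_lines(pasted_text, separator=","):
    """Replace line breaks in pasted text with the chosen separator."""
    # splitlines() handles \n, \r\n and \r, and drops a trailing line break.
    return separator.join(pasted_text.splitlines())

comma_joined = join_pasted_lines("test1\ntest2\ntest3", ",")
semicolon_joined = join_pasted_lines("test1\r\ntest2\r\ntest3\n", ";")
print(comma_joined)      # test1,test2,test3
print(semicolon_joined)  # test1;test2;test3
```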
